# Checkpoint Saving and Loading Guide

## Overview
This guide explains how models, checkpoints, and configurations are saved and loaded in the Mistral NER project.
## Model Saving Behavior

### During Training
- **Regular Checkpoints**: Saved to `output_dir` (default: `./mistral-ner-finetuned/`)
    - Contains: LoRA adapter weights only
    - Files: `adapter_config.json`, `adapter_model.safetensors`
    - Size: small (~10-50 MB)
- **Final Model**: Saved to `final_output_dir` (default: `./mistral-ner-finetuned-final/`)
    - Contains: LoRA adapter weights
    - Also includes: tokenizer and `config.yaml`
- **Merged Model (NEW)**: Saved if `merge_adapters_on_save: true` (default); see the sketch after this list
    - Saved to: `./mistral-ner-finetuned-final-merged/`
    - Contains: complete model with LoRA weights merged
    - Size: large (~14 GB for Mistral-7B)
    - Ready for deployment without the base model
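As a rough illustration, here is how the merged copy could be produced with PEFT. This is a minimal sketch, not the project's actual save path; the model class and `num_labels` are assumptions.

```python
# Sketch: merging LoRA adapters into the base model with PEFT.
# Model class and num_labels are assumptions; the project's save
# logic may wire this differently.
import torch
from peft import PeftModel
from transformers import AutoModelForTokenClassification, AutoTokenizer

base = AutoModelForTokenClassification.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    num_labels=9,                   # assumption: CoNLL-2003 NER label count
    torch_dtype=torch.float16,
)
peft_model = PeftModel.from_pretrained(base, "./mistral-ner-finetuned-final")
merged = peft_model.merge_and_unload()  # fold LoRA deltas into base weights
merged.save_pretrained("./mistral-ner-finetuned-final-merged")

tokenizer = AutoTokenizer.from_pretrained("./mistral-ner-finetuned-final")
tokenizer.save_pretrained("./mistral-ner-finetuned-final-merged")
```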
### File Structure
```
# Checkpoint directory
mistral-ner-finetuned/checkpoint-500/
├── adapter_config.json          # LoRA configuration
├── adapter_model.safetensors    # LoRA weights only
├── config.yaml                  # Training configuration
├── tokenizer_config.json
├── special_tokens_map.json
└── tokenizer.json

# Final model directory (adapters only)
mistral-ner-finetuned-final/
├── adapter_config.json
├── adapter_model.safetensors
├── config.yaml
├── tokenizer_config.json
├── special_tokens_map.json
└── tokenizer.json

# Merged model directory (full model)
mistral-ner-finetuned-final-merged/
├── config.json                  # Model configuration
├── model.safetensors            # Complete merged weights
├── config.yaml                  # Training configuration
├── tokenizer_config.json
├── special_tokens_map.json
└── tokenizer.json
```
## Loading Models

### For Inference

The inference script automatically detects the model type:
```bash
# Loading an adapter-only model (requires the base model)
python scripts/inference.py \
    --model-path ./mistral-ner-finetuned-final \
    --base-model mistralai/Mistral-7B-v0.3 \
    --text "John works at Microsoft"

# Loading a merged model (standalone)
python scripts/inference.py \
    --model-path ./mistral-ner-finetuned-final-merged \
    --text "John works at Microsoft"
```
### Programmatically
```python
from src.model import load_model_from_checkpoint, setup_model

# Load adapter model
model, tokenizer = load_model_from_checkpoint(
    checkpoint_path="./mistral-ner-finetuned-final",
    config=config,
    base_model_name="mistralai/Mistral-7B-v0.3",
)

# Load merged model
model, tokenizer = setup_model(
    model_name="./mistral-ner-finetuned-final-merged",
    config=config,
)
```
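Once loaded, either variant can be used like an ordinary Hugging Face model. A minimal sketch, assuming the model exposes a token-classification head and its own `id2label` map; the project's actual inference helpers may differ:

```python
# Sketch: token-level prediction with the loaded model and tokenizer.
import torch

text = "John works at Microsoft"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits           # (1, seq_len, num_labels)

pred_ids = logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, pred_ids):
    print(token, model.config.id2label[pred])
```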
## Configuration Options

### Enable/Disable Adapter Merging
In your config file:
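A sketch, assuming the key sits at the top level of `config.yaml`:

```yaml
# Exact nesting within config.yaml is an assumption; the option itself
# is merge_adapters_on_save, as documented above.
merge_adapters_on_save: true   # set to false to skip saving a merged copy
```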
Or via command line:
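The flag name below is an assumption mirroring the config key; check `python scripts/train.py --help` for the exact option:

```bash
# Assumed flag name; verify against the script's --help output.
python scripts/train.py --merge-adapters-on-save false
```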
## Resume Training
```bash
# Resume from a specific checkpoint
python scripts/train.py --resume-from-checkpoint ./mistral-ner-finetuned/checkpoint-500

# Resume from the last checkpoint (if configured)
python scripts/train.py
```
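Under the hood this typically maps onto the Hugging Face `Trainer` resume mechanism; a minimal sketch, assuming `model`, `args`, and `train_dataset` come from the project's usual training setup:

```python
# Sketch: resuming via the Hugging Face Trainer API. `model`, `args`, and
# `train_dataset` are assumed to be built by the project's pipeline.
from transformers import Trainer

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# Resume from an explicit checkpoint directory...
trainer.train(resume_from_checkpoint="./mistral-ner-finetuned/checkpoint-500")

# ...or let the Trainer pick up the latest checkpoint in output_dir:
trainer.train(resume_from_checkpoint=True)
```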
## Best Practices
- For Development: Use adapter-only models (smaller, faster to save/load)
- For Deployment: Use merged models (no dependency on base model)
- For Fine-tuning: Keep adapter format for continued training
- For Inference: Merged models are slightly faster
## Storage Requirements
- Base Mistral-7B model: ~14GB
- LoRA adapters: ~10-50MB
- Merged model: ~14GB
- Training checkpoints: ~50MB each
## Troubleshooting

### Issue: Model too large to save

- **Solution**: Disable adapter merging with `merge_adapters_on_save: false`
### Issue: Can't load model for inference

- Check whether it's an adapter model (needs the base model) or a merged model
- For adapters: provide the `--base-model` parameter
### Issue: Resume training fails

- Ensure the checkpoint contains the required files
- Check that config versions match
## Technical Details

### What Gets Saved

- **Model Weights**:
    - Adapters: only the LoRA matrices (Q, K, V, O projections)
    - Merged: complete model weights
- **Tokenizer**:
    - Vocabulary
    - Special tokens
    - Model max length (from config)
- **Configuration** (see the sketch after this list):
    - Training hyperparameters
    - Model architecture
    - Data settings
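As a rough illustration, the saved `config.yaml` might group those settings as follows; every key and value here is hypothetical, since the project's actual schema may differ:

```yaml
# Hypothetical excerpt of a saved config.yaml; actual key names and
# structure in this project may differ.
model:
  base_model: mistralai/Mistral-7B-v0.3
  lora_r: 16                 # model architecture: LoRA rank
training:
  learning_rate: 2.0e-4      # training hyperparameter
  num_train_epochs: 3
data:
  max_length: 256            # data setting / tokenizer max length
```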
### Loading Process

1. **Checkpoint Detection**: Check for `adapter_config.json`
2. **Base Model Loading**: Load with quantization if configured
3. **Adapter Application**: Apply LoRA weights to the base model
4. **Tokenizer Setup**: Configure padding and max length
5. **Validation**: Ensure all components are compatible
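A minimal sketch of step 1, assuming detection simply keys off the presence of the PEFT adapter config; the project's actual check may differ:

```python
# Sketch: distinguish adapter-only from merged checkpoints by the
# presence of adapter_config.json (step 1 above).
from pathlib import Path

def is_adapter_checkpoint(path: str) -> bool:
    """True if `path` holds LoRA adapters only (needs a base model)."""
    return (Path(path) / "adapter_config.json").is_file()

print(is_adapter_checkpoint("./mistral-ner-finetuned-final"))         # True
print(is_adapter_checkpoint("./mistral-ner-finetuned-final-merged"))  # False
```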