Mistral NER Documentation
Welcome to Mistral NER
Mistral NER is a state-of-the-art Named Entity Recognition (NER) system built on top of the Mistral-7B language model. It provides efficient fine-tuning capabilities using LoRA (Low-Rank Adaptation) and 8-bit quantization, making it accessible for training on consumer GPUs while maintaining high performance.
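Concretely, that setup means loading the base model with 8-bit weights and wrapping it with trainable low-rank adapters. Below is a minimal sketch using Hugging Face transformers and peft; the label count and LoRA hyperparameters are illustrative, not the project's defaults:
from transformers import AutoModelForTokenClassification, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

# Load Mistral-7B with 8-bit weights so it fits on a consumer GPU
model = AutoModelForTokenClassification.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    num_labels=9,  # e.g. CoNLL-2003 BIO tags; adjust to your label set
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters; only these small matrices are trained
lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=16, lora_alpha=32, lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # illustrative choice of projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights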
Key Features
- High Performance - Achieve F1 scores of 85%+ on standard NER benchmarks with optimized loss functions for handling class imbalance
- Memory Efficient - Train on consumer GPUs with 8-bit quantization and LoRA, reducing memory usage by up to 75%
- Multi-Dataset Support - Built-in support for 9 datasets, including CoNLL-2003, OntoNotes, and specialized PII detection datasets
- Advanced Optimization - Integrated hyperparameter optimization with Ray Tune and Optuna for finding optimal configurations
- Production Ready - REST API, comprehensive logging, model versioning, and deployment guides for production use
- Easy to Extend - Clean architecture with registry patterns makes adding new datasets and loss functions straightforward (see the registry sketch after this list)
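The registry pattern mentioned above maps config names to implementations, so new components plug in without touching core code. A hypothetical sketch of registering a custom loss (the decorator and registry names are illustrative, not the project's actual API):
import torch
import torch.nn as nn

LOSS_REGISTRY: dict[str, type[nn.Module]] = {}  # hypothetical registry

def register_loss(name: str):
    """Decorator that adds a loss class to the registry under `name`."""
    def decorator(cls: type[nn.Module]) -> type[nn.Module]:
        LOSS_REGISTRY[name] = cls
        return cls
    return decorator

@register_loss("weighted_ce")
class WeightedCrossEntropy(nn.Module):
    def __init__(self, weights: torch.Tensor):
        super().__init__()
        self.loss = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)

    def forward(self, logits, labels):
        return self.loss(logits.view(-1, logits.size(-1)), labels.view(-1))

# A config entry like `loss: weighted_ce` can then be resolved at runtime:
# loss_cls = LOSS_REGISTRY["weighted_ce"]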
Quick Start
Get up and running in just a few minutes:
# Install with CUDA support
pip install -e ".[cuda12]"
# Download and prepare data
python scripts/prepare_data.py
# Start training with default configuration
python scripts/train.py
# Run inference on your text
python scripts/inference.py --text "Apple Inc. CEO Tim Cook announced new products in Cupertino."
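For programmatic use, a fine-tuned checkpoint can also be called from Python. A minimal sketch with the standard transformers token-classification pipeline, assuming the adapter weights have been merged into a checkpoint the pipeline can load (the path is a placeholder):
from transformers import pipeline

# Placeholder path; point this at your fine-tuned checkpoint
ner = pipeline(
    "token-classification",
    model="path/to/your/finetuned-checkpoint",
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

text = "Apple Inc. CEO Tim Cook announced new products in Cupertino."
for entity in ner(text):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))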
Documentation Structure
For Beginners
- Installation Guide - Set up your environment step by step
- Quick Start - Get results in 5 minutes
- First Training - Train your first NER model
For Practitioners
- Configuration Guide - Master the configuration system
- Loss Functions - Choose the right loss for your use case
- Datasets - Explore available datasets and add your own
- Hyperparameter Tuning - Optimize model performance (a minimal Optuna sketch follows this list)
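The Hyperparameter Tuning guide covers Ray Tune and Optuna in depth. As a rough illustration, here is a minimal Optuna sketch, where `train_and_evaluate` is a hypothetical stand-in for a function that runs one training job and returns validation F1:
import optuna

def objective(trial: optuna.Trial) -> float:
    # Sample a candidate configuration
    config = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-4, log=True),
        "lora_r": trial.suggest_categorical("lora_r", [8, 16, 32]),
        "batch_size": trial.suggest_categorical("batch_size", [4, 8, 16]),
    }
    # Hypothetical helper: trains with `config`, returns validation F1
    return train_and_evaluate(config)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("Best F1:", study.best_value, "with", study.best_params)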
For Advanced Users
Performance Highlights
Our optimized configurations achieve impressive results across multiple benchmarks:
| Dataset | F1 Score | Precision | Recall |
|---|---|---|---|
| CoNLL-2003 | 91.2% | 90.8% | 91.6% |
| OntoNotes 5.0 | 88.5% | 87.9% | 89.1% |
| WNUT-17 | 85.3% | 84.7% | 85.9% |
Architecture Overview
The end-to-end flow, in Mermaid notation:
graph LR
A[Input Text] --> B[Mistral-7B Base]
B --> C[LoRA Adapters]
C --> D[Token Classification Head]
D --> E[NER Predictions]
F[8-bit Quantization] --> B
G[Focal Loss] --> D
style A fill:#f9f,stroke:#333,stroke-width:2px
style E fill:#9f9,stroke:#333,stroke-width:2px
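The Focal Loss node above addresses class imbalance: the overwhelmingly common O tag would otherwise dominate the gradient. A minimal PyTorch sketch of focal loss for token classification (the gamma value is illustrative, not necessarily the project's default):
import torch
import torch.nn.functional as F

def focal_loss(logits, labels, gamma: float = 2.0, ignore_index: int = -100):
    """FL(p_t) = -(1 - p_t)^gamma * log(p_t), averaged over labeled tokens."""
    logits = logits.view(-1, logits.size(-1))
    labels = labels.view(-1)
    mask = labels != ignore_index  # skip padding / sub-word positions
    logits, labels = logits[mask], labels[mask]
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    # (1 - pt)^gamma down-weights tokens the model already classifies easily
    return (-(1 - pt) ** gamma * log_pt).mean()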
Contributing
We welcome contributions! Check out our Contributing Guide to get started.
Citation
If you use Mistral NER in your research, please cite:
@software{mistral_ner,
  author = {Nevedomski, Sergei},
  title  = {Mistral NER: Efficient Named Entity Recognition with Mistral-7B},
  year   = {2024},
  url    = {https://github.com/nevedomski/mistral_ner}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.