
Mistral NER Documentation

License: MIT | Python 3.11 | Hugging Face | GitHub

Welcome to Mistral NER

Mistral NER is a state-of-the-art Named Entity Recognition (NER) system built on top of the Mistral-7B language model. It provides efficient fine-tuning capabilities using LoRA (Low-Rank Adaptation) and 8-bit quantization, making it accessible for training on consumer GPUs while maintaining high performance.

🎯 Key Features

  • 🚀 High Performance

    Achieve F1 scores of 85%+ on standard NER benchmarks with optimized loss functions for handling class imbalance

  • 💾 Memory Efficient

    Train on consumer GPUs with 8-bit quantization and LoRA, reducing memory usage by up to 75%

  • 📚 Multi-Dataset Support

    Built-in support for 9 datasets, including CoNLL-2003, OntoNotes, and specialized PII detection datasets

  • ⚙️ Advanced Optimization

    Integrated hyperparameter optimization with Ray Tune and Optuna for finding optimal configurations

  • 📈 Production Ready

    REST API, comprehensive logging, model versioning, and deployment guides for production use

  • 🤝 Easy to Extend

    Clean architecture with registry patterns makes adding new datasets and loss functions straightforward
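To see why LoRA reduces memory so sharply, note that a rank-r adapter on a d_in × d_out weight matrix trains only r·(d_in + d_out) parameters instead of d_in·d_out. A minimal sketch of that arithmetic; the 4096 × 4096 projection shape and rank 16 below are illustrative assumptions, not this project's actual configuration:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one LoRA adapter pair (A: d_in x r, B: r x d_out)."""
    return rank * (d_in + d_out)

# Hypothetical example: a single 4096 x 4096 attention projection, rank 16
full = 4096 * 4096                         # parameters updated in full fine-tuning
lora = lora_trainable_params(4096, 4096, 16)
print(full, lora, f"{lora / full:.4%}")    # LoRA trains well under 1% of them here
```

The same ratio holds per adapted layer, which is why the adapters add only a small overhead on top of the frozen, 8-bit-quantized base weights.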

🚀 Quick Start

Get up and running in just a few minutes:

# Install with CUDA support
pip install -e ".[cuda12]"

# Download and prepare data
python scripts/prepare_data.py

# Start training with default configuration
python scripts/train.py

# Run inference on your text
python scripts/inference.py --text "Apple Inc. CEO Tim Cook announced new products in Cupertino."
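Under the hood, the model's output for a sentence like the one above is a sequence of per-token BIO tags that must be grouped into entity spans. A self-contained sketch of that decoding step; the helper name and tag set are illustrative, not the project's actual API:

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(tok)
        else:  # "O" or an inconsistent I- tag closes any open span
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        spans.append((ctype, " ".join(current)))
    return spans

tokens = ["Apple", "Inc.", "CEO", "Tim", "Cook", "announced", "new",
          "products", "in", "Cupertino", "."]
tags = ["B-ORG", "I-ORG", "O", "B-PER", "I-PER", "O", "O", "O", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('ORG', 'Apple Inc.'), ('PER', 'Tim Cook'), ('LOC', 'Cupertino')]
```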

📚 Documentation Structure

For Beginners

For Practitioners

For Advanced Users

πŸ† Performance HighlightsΒΆ

Our optimized configurations achieve strong results across multiple benchmarks:

Dataset        | F1 Score | Precision | Recall
CoNLL-2003     | 91.2%    | 90.8%     | 91.6%
OntoNotes 5.0  | 88.5%    | 87.9%     | 89.1%
WNUT-17        | 85.3%    | 84.7%     | 85.9%
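F1 is the harmonic mean of precision and recall, so the table's columns can be cross-checked against each other: F1 = 2PR / (P + R). A one-liner verifying the CoNLL-2003 row:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Cross-check the CoNLL-2003 row from the table above
print(round(f1(0.908, 0.916), 3))  # 0.912
```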

πŸ› οΈ Architecture OverviewΒΆ

graph LR
    A[Input Text] --> B[Mistral-7B Base]
    B --> C[LoRA Adapters]
    C --> D[Token Classification Head]
    D --> E[NER Predictions]

    F[8-bit Quantization] --> B
    G[Focal Loss] --> D

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style E fill:#9f9,stroke:#333,stroke-width:2px
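The Focal Loss box in the diagram targets the class imbalance mentioned under Key Features: the dominant O tag would otherwise swamp the gradient. A dependency-free sketch of focal loss for a single token, FL(p) = -α(1 - p)^γ log(p); the α = 0.25 and γ = 2.0 defaults are the values from the original focal loss paper, assumed here for illustration rather than taken from this project's configuration:

```python
import math

def focal_loss(p_correct: float, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss for one token, given the predicted probability of the
    correct class; the (1 - p)^gamma factor down-weights easy examples."""
    return -alpha * (1.0 - p_correct) ** gamma * math.log(p_correct)

# An easy, confident prediction (p=0.95) contributes far less to the loss
# than a hard one (p=0.3), so rare entity tags are not drowned out by O tags
easy, hard = focal_loss(0.95), focal_loss(0.3)
print(f"easy={easy:.6f} hard={hard:.6f}")
```

In practice this is applied per token over the logits of the classification head; the sketch only shows the scalar form of the weighting.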

🤝 Contributing

We welcome contributions! Check out our Contributing Guide to get started.

📖 Citation

If you use Mistral NER in your research, please cite:

@software{mistral_ner,
  author = {Nevedomski, Sergei},
  title = {Mistral NER: Efficient Named Entity Recognition with Mistral-7B},
  year = {2024},
  url = {https://github.com/nevedomski/mistral_ner}
}

πŸ“ LicenseΒΆ

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by the Mistral NER community