
Mistral NER Documentation

License: MIT | Python 3.11 | Hugging Face | GitHub

Welcome to Mistral NER

Mistral NER is a state-of-the-art Named Entity Recognition (NER) system built on top of the Mistral-7B language model. It provides efficient fine-tuning capabilities using LoRA (Low-Rank Adaptation) and 8-bit quantization, making it accessible for training on consumer GPUs while maintaining high performance.

🎯 Key Features

  • 🚀 High Performance

    Achieve F1 scores of 85%+ on standard NER benchmarks with optimized loss functions for handling class imbalance

  • 💾 Memory Efficient

    Train on consumer GPUs with 8-bit quantization and LoRA, reducing memory usage by up to 75%

  • 📚 Multi-Dataset Support

    Built-in support for 9 datasets, including CoNLL-2003, OntoNotes, and specialized PII detection datasets

  • ⚙️ Advanced Optimization

    Integrated hyperparameter optimization with Ray Tune and Optuna for finding optimal configurations

  • 📈 Production Ready

    REST API, comprehensive logging, model versioning, and deployment guides for production use

  • 🤝 Easy to Extend

    Clean architecture with registry patterns makes adding new datasets and loss functions straightforward
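To see why LoRA reduces memory so sharply, note that a rank-r adapter on a d_in × d_out weight matrix trains only r·(d_in + d_out) parameters instead of d_in·d_out. A minimal sketch of that arithmetic; the 4096 × 4096 projection shape and rank 16 below are illustrative assumptions, not this project's actual configuration:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one LoRA adapter pair (A: d_in x r, B: r x d_out)."""
    return rank * (d_in + d_out)

# Hypothetical example: a single 4096 x 4096 attention projection, rank 16
full = 4096 * 4096                         # parameters updated in full fine-tuning
lora = lora_trainable_params(4096, 4096, 16)
print(full, lora, f"{lora / full:.4%}")    # LoRA trains well under 1% of them here
```

The same ratio holds per adapted layer, which is why the adapters add only a small overhead on top of the frozen, 8-bit-quantized base weights.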

🚀 Quick Start

Get up and running in just a few minutes:

# Install with CUDA support
pip install -e ".[cuda12]"

# Download and prepare data
python scripts/prepare_data.py

# Start training with default configuration
python scripts/train.py

# Run inference on your text
python scripts/inference.py --text "Apple Inc. CEO Tim Cook announced new products in Cupertino."
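Under the hood, the model's output for a sentence like the one above is a sequence of per-token BIO tags that must be grouped into entity spans. A self-contained sketch of that decoding step; the helper name and tag set are illustrative, not the project's actual API:

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans, current, ctype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(tok)
        else:  # "O" or an inconsistent I- tag closes any open span
            if current:
                spans.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        spans.append((ctype, " ".join(current)))
    return spans

tokens = ["Apple", "Inc.", "CEO", "Tim", "Cook", "announced", "new",
          "products", "in", "Cupertino", "."]
tags = ["B-ORG", "I-ORG", "O", "B-PER", "I-PER", "O", "O", "O", "O", "B-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('ORG', 'Apple Inc.'), ('PER', 'Tim Cook'), ('LOC', 'Cupertino')]
```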

📚 Documentation Structure

For Beginners

For Practitioners

For Advanced Users

πŸ† Performance HighlightsΒΆ

Our optimized configurations achieve strong results across multiple benchmarks:

Dataset        | F1 Score | Precision | Recall
CoNLL-2003     | 91.2%    | 90.8%     | 91.6%
OntoNotes 5.0  | 88.5%    | 87.9%     | 89.1%
WNUT-17        | 85.3%    | 84.7%     | 85.9%
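F1 is the harmonic mean of precision and recall, so the table's columns can be cross-checked against each other: F1 = 2PR / (P + R). A one-liner verifying the CoNLL-2003 row:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Cross-check the CoNLL-2003 row from the table above
print(round(f1(0.908, 0.916), 3))  # 0.912
```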

πŸ› οΈ Architecture OverviewΒΆ

graph LR
    A[Input Text] --> B[Mistral-7B Base]
    B --> C[LoRA Adapters]
    C --> D[Token Classification Head]
    D --> E[NER Predictions]

    F[8-bit Quantization] --> B
    G[Focal Loss] --> D

    style A fill:#f9f,stroke:#333,stroke-width:2px
    style E fill:#9f9,stroke:#333,stroke-width:2px
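The Focal Loss box in the diagram targets the class imbalance mentioned under Key Features: the dominant O tag would otherwise swamp the gradient. A dependency-free sketch of focal loss for a single token, FL(p) = -α(1 - p)^γ log(p); the α = 0.25 and γ = 2.0 defaults are the values from the original focal loss paper, assumed here for illustration rather than taken from this project's configuration:

```python
import math

def focal_loss(p_correct: float, alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Focal loss for one token, given the predicted probability of the
    correct class; the (1 - p)^gamma factor down-weights easy examples."""
    return -alpha * (1.0 - p_correct) ** gamma * math.log(p_correct)

# An easy, confident prediction (p=0.95) contributes far less to the loss
# than a hard one (p=0.3), so rare entity tags are not drowned out by O tags
easy, hard = focal_loss(0.95), focal_loss(0.3)
print(f"easy={easy:.6f} hard={hard:.6f}")
```

In practice this is applied per token over the logits of the classification head; the sketch only shows the scalar form of the weighting.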

🤝 Contributing

We welcome contributions! Check out our Contributing Guide to get started.

📖 Citation

If you use Mistral NER in your research, please cite:

@software{mistral_ner,
  author = {Nevedomski, Sergei},
  title = {Mistral NER: Efficient Named Entity Recognition with Mistral-7B},
  year = {2024},
  url = {https://github.com/nevedomski/mistral_ner}
}

πŸ“ LicenseΒΆ

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ by the Mistral NER community