
AI / LLM Engineer

City

Paris

Start

ASAP

Sector

Bank

Workspace

Hybrid

Experience

6+ years

Home office

4 days per week

Eva-May

Headhunter without weapons & violence 

Job description

Two main projects:
 
1. Multilabel mail classification:
Objective: Deliver a reliable, high-performance multilabel classification solution covering a large number of categories, and integrate it into a package designed to be industrialized and put into production:
- Baseline modeling and evaluation:
Vectorization: TF-IDF, Word2Vec, contextual embeddings.
Models: Random Forest, SVM, XGBoost (tuned with grid search).
- Active learning: entropy-based sampling with density weighting.
- Transformer models (selected): CamemBERT, FlauBERT, CamemBERTa.
- Library packaging: Python, PyTorch & PyTorch Lightning.
- Strict development standards: SOLID principles and PEP 8, code quality (no dead code, code in the right place, ease of use, modularity), pre-commit, vulture, Black, isort, pylint, unit testing, Bandit, SonarQube, GitLab CI.
- Production release of model and complete package.
- Part 2: Generating automatic responses with LLMs
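
To give candidates a feel for the active-learning item above, here is a minimal sketch of entropy-based sampling with density weighting for a multilabel model. It is an illustration only, not the project's code: the function names, the per-label binary entropy summed over labels, the cosine-similarity density term and the `beta` exponent are all assumptions made for this example.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a single label with predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def multilabel_entropy(probs):
    """Uncertainty of one email: sum of per-label entropies (an assumption)."""
    return sum(binary_entropy(p) for p in probs)


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def density_weighted_scores(probs, vectors, beta=1.0):
    """Entropy of each item times its mean similarity to the rest of the pool."""
    n = len(vectors)
    scores = []
    for i in range(n):
        others = [cosine(vectors[i], vectors[j]) for j in range(n) if j != i]
        density = sum(others) / len(others) if others else 1.0
        scores.append(multilabel_entropy(probs[i]) * max(density, 0.0) ** beta)
    return scores


def select_batch(probs, vectors, k, beta=1.0):
    """Indices of the k unlabeled items to send for annotation, best first."""
    scores = density_weighted_scores(probs, vectors, beta)
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy pool: predicted label probabilities and (hypothetical) embeddings.
pool_probs = [[0.5, 0.5], [0.99, 0.01], [0.5, 0.5]]
pool_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
chosen = select_batch(pool_probs, pool_vectors, k=1)
```

In this toy pool the first and third emails are equally uncertain, but the third is an outlier with no similar neighbours, so the density term favours the first one for annotation.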
 
2. Email segmentation:
Objective: Improve the accuracy of email segmentation compared to a regular expression-based approach:
- Analysis and evaluation of regular expressions:
Identification of the strengths and weaknesses of regular expressions.
Calculation of performance metrics (precision, recall, BLEU score, chrF, etc.).
 
- Development of AI models:
Exploration of different model architectures for classification and NER: LSTM models, then transformers.
Implementation of models using the Hugging Face API (Trainer, etc.).
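
As a rough illustration of the regular-expression evaluation step described above, the sketch below scores a regex baseline on segment-boundary detection using precision and recall. The patterns, the sample email and the gold boundary indices are hypothetical and chosen only for this example; the actual project may define segments and metrics differently.

```python
import re

# Hypothetical boundary patterns: reply header, forwarded block, signature mark.
BOUNDARY_PATTERNS = [
    re.compile(r"^On .+ wrote:$"),
    re.compile(r"^-{2,}\s*Original Message\s*-{2,}$", re.IGNORECASE),
    re.compile(r"^-- ?$"),
]


def predict_boundaries(lines):
    """Return the indices of lines the regex baseline marks as segment starts."""
    return {
        i
        for i, line in enumerate(lines)
        if any(p.match(line.strip()) for p in BOUNDARY_PATTERNS)
    }


def precision_recall(predicted, gold):
    """Precision and recall of predicted boundary indices against gold ones."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall


# Made-up email: a signature mark at index 2, a French reply header at index 4.
lines = [
    "Hello,",
    "Please find the statement attached.",
    "--",
    "Alice",
    "Le lun. 3 janv. 2022, Bob a écrit :",
    "> Can you send the statement?",
]
gold = {2, 4}

predicted = predict_boundaries(lines)
precision, recall = precision_recall(predicted, gold)
```

Here the baseline finds the signature mark but misses the French reply header, so precision is 1.0 and recall is 0.5. This is the kind of weakness a learned sequence-labelling model is expected to address.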

Stack

NLP; Machine Learning Modeling; Deep Learning; PyTorch; PyTorch Lightning; Transformers; Hugging Face; GitLab CI; DevOps / MLOps; spaCy; scikit-learn; Pandas; NumPy; seaborn

