Machine Learning Engineer — Multilingual Data

featherlessai · Research
Apply Now ↗
🌍 Remote · 📍 Remote (world) · Full-time

About this role

We’re looking for a Machine Learning Engineer to own and scale our multilingual data pipeline—from sourcing and curation to evaluation and continuous improvement. You’ll work closely with researchers and infra engineers to ensure our models perform robustly across languages, scripts, and cultural contexts.

This role sits at the intersection of data, research, and production ML. It is ideal for someone who cares deeply about data quality, linguistic diversity, and model generalization beyond English.

What You’ll Do

  • Design, build, and maintain large-scale multilingual datasets across high- and low-resource languages

  • Develop data pipelines for collection, cleaning, normalization, deduplication, and labeling

  • Implement quality filters using statistical, heuristic, and model-based methods

  • Work with researchers to define language coverage, benchmarks, and evaluation metrics

  • Analyze dataset bias, coverage gaps, and failure modes across regions and scripts

  • Support training, fine-tuning, and distillation workflows with high-quality multilingual data

  • Continuously iterate on datasets based on model performance and real-world usage

What We’re Looking For

  • 3+ years of experience as an ML Engineer or Applied Scientist, or in a similar role

  • Strong experience working with multilingual or non-English datasets

  • Solid understanding of NLP fundamentals (tokenization, embeddings, language modeling)

  • Experience building scalable data pipelines (Python, Spark, Ray, or similar)

  • Familiarity with Unicode, scripts, tokenization challenges, and language-specific quirks

  • Comfort collaborating with researchers and translating research needs into production systems

Nice to Have

  • Experience with low-resource languages or multilingual benchmarks (e.g. FLORES, XTREME)

  • Exposure to LLM training, fine-tuning, or distillation

  • Linguistics background or experience working with native language experts

  • Contributions to open-source datasets or ML tooling

  • Experience with data quality evaluation at scale

Why Join

  • Real ownership over a core differentiator of the product

  • Work on models used globally, not just in English-speaking markets

  • Small, high-caliber team with deep ML and systems experience

  • Competitive compensation plus meaningful equity at the Series A stage

Frequently Asked Questions

Is the salary disclosed for the Machine Learning Engineer — Multilingual Data position at featherlessai?
The salary for this Machine Learning Engineer — Multilingual Data role at featherlessai is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Is the Machine Learning Engineer — Multilingual Data job at featherlessai remote?
Yes. The listing marks this position as remote, with the location given as "Remote (world)".
Is the Machine Learning Engineer — Multilingual Data role at featherlessai full-time or part-time?
This is listed as a full-time position in the Research department at featherlessai.
Which team or department does the Machine Learning Engineer — Multilingual Data at featherlessai belong to?
This Machine Learning Engineer — Multilingual Data position is part of the Research department at featherlessai. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Machine Learning Engineer — Multilingual Data position at featherlessai?
Click the "Apply Now" button on this page. You will be redirected to featherlessai's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Machine Learning Engineer — Multilingual Data job at featherlessai posted?
This Machine Learning Engineer — Multilingual Data position at featherlessai was posted on Jan 22, 2026. Apply as soon as possible — early applications are often reviewed first.