Senior Applied ML Engineer (Speech & Audio)

About this role

Company Description

Project Overview

Join a cutting-edge initiative focused on building advanced AI voice infrastructure for Arabic-speaking markets. The project involves developing state-of-the-art Arabic speech technologies, including:

  • Natural Text-to-Speech (TTS)
  • Real-Time Automatic Speech Recognition (ASR)
  • End-to-End Speech-to-Speech Conversational Systems

The solutions are tailored to regional Arabic dialects, including Egyptian, Gulf, Levantine, and others.

Job Description

We are seeking a highly skilled Senior Applied Machine Learning Engineer with deep expertise in speech and audio technologies. In this role, you will design, fine-tune, and optimize advanced machine learning models for Arabic voice applications. You will work across the full development lifecycle, from data pipeline construction and model experimentation to inference optimization and production deployment.

This position is ideal for engineers who are passionate about transforming cutting-edge research into scalable, low-latency systems that support natural and accurate Arabic speech interactions.

Key Responsibilities

  • Benchmark and evaluate TTS and ASR models using Arabic-specific test sets, measuring metrics such as Word Error Rate (WER), naturalness, and dialect coverage.
  • Fine-tune generative models for voice cloning, zero-shot speaker adaptation, and speech synthesis.
  • Build and maintain Arabic-focused data pipelines, including:
    • Audio collection and preprocessing
    • Diacritization (Tashkil)
    • Data cleaning and augmentation
  • Optimize model inference for production environments using:
    • Quantization
    • KV-cache tuning
    • Streaming inference techniques
  • Integrate and evaluate complete speech-to-speech conversational pipelines.
  • Conduct experiments based on recent research papers and convert findings into production-ready solutions.
  • Collaborate with engineering and product teams to deploy robust and scalable speech systems.

Qualifications

Required Qualifications

  • 5+ years of experience in Machine Learning, Applied AI, or AI Research.
  • Strong programming skills in Python.
  • Extensive hands-on experience with PyTorch and the Hugging Face ecosystem.
  • Proven experience training and fine-tuning neural models for:
    • Text-to-Speech (TTS)
    • Automatic Speech Recognition (ASR)
    • Audio codecs
  • Deep understanding of modern speech architectures such as:
    • Whisper
    • Conformer
    • HiFi-GAN
    • Diffusion-based models
  • Experience with audio processing techniques including:
    • Voice Activity Detection (VAD)
    • Speaker Diarization
    • Neural Vocoders
  • Demonstrated ability to implement and adapt research papers into practical production experiments.
  • Strong understanding of Arabic language challenges, including:
    • Diacritization (Tashkil)
    • Dialectal variations
    • Code-switching
  • Experience with inference optimization techniques such as:
    • Quantization
    • Streaming inference
    • NVIDIA TensorRT

Preferred Qualifications

  • Experience developing custom NVIDIA CUDA kernels for high-performance model inference.
  • Familiarity with speculative decoding and other advanced acceleration techniques.
  • Experience deploying models at scale in cloud or GPU-based production environments.
  • Contributions to open-source speech or machine learning projects.

Additional Information

WHY YOU’LL LOVE US

  • Free benefits for all employees (our famous games room, daily breakfast, fruit, coffee and other hot drinks, soft drinks and juices, company days out, parties…)
  • Social insurance
  • Open-door management policy
  • Full medical insurance
  • Accommodation and transportation allowance
  • Friendly environment that values innovation and efficiency
  • Exciting opportunities for career growth and talent development
  • A culture that encourages feedback
  • Recognition and reward programs
  • Competitive salaries and incentives
  • Flexible and comfortable schedule
  • Fun committees
  • Monetary rewards
  • Fun, smart and creative people
  • Career opportunities with a growing team
  • Paid vacations
  • Social benefits

For more information about Nile Bits, please visit our website:

https://www.nilebits.com

Frequently Asked Questions

Is the salary disclosed for the Senior Applied ML Engineer (Speech & Audio) position at Nile Bits?
The salary for this role is not publicly listed. See the company's official careers page for details of the compensation package.
Is the Senior Applied ML Engineer (Speech & Audio) job at Nile Bits remote?
Yes, this position is remote, with team members based in Cairo, Cairo Governorate, Egypt. You can work from home or anywhere in the supported regions.
Is the Senior Applied ML Engineer (Speech & Audio) role at Nile Bits full-time or part-time?
This is listed as a full-time position.
How do I apply for the Senior Applied ML Engineer (Speech & Audio) position at Nile Bits?
Applications are submitted directly through Nile Bits's official application portal, which is hosted on SmartRecruiters.
When was the Senior Applied ML Engineer (Speech & Audio) job at Nile Bits posted?
This position was posted on May 18, 2026. Apply as soon as possible; early applications are often reviewed first.