AI Researcher (Multimodal Audio/Video Generation)

Tavus · Engineering, Product, & Design
📍 San Francisco · Full-Time

About this role

About Us

Tavus is a research lab pioneering human computing. We’re building AI Humans: a new interface that closes the gap between people and machines, free from the friction of today’s systems. Our real-time human simulation models let machines see, hear, respond, and even look real—enabling meaningful, face-to-face conversations. AI Humans combine the emotional intelligence of humans with the reach and reliability of machines, making them capable, trusted agents available 24/7, in every language, on our terms.

Imagine a therapist anyone can afford. A personal trainer that adapts to your schedule. A fleet of medical assistants that can give every patient the attention they need. With Tavus, individuals, enterprises, and developers can all build AI Humans to connect, understand, and act with empathy at scale.

We’re a Series A company backed by world-class investors including Sequoia Capital, Y Combinator, and Scale Venture Partners.

Be part of shaping a future where humans and machines truly understand each other.

The Role
We’re hiring a Senior AI Researcher to lead research in audio-visual avatar generation. This role is for someone who thrives in ambiguity, has a track record of pushing generative models to new frontiers, and wants to define what human–AI interaction looks like in practice.

Your Mission 🚀

  • Lead research efforts on audio-visual generation for avatars (Neural Avatars, Talking-Heads), with a focus on conversational settings.

  • Design models that are coupled with conversation flow — capturing and generating verbal + non-verbal signals in sync.

  • Drive innovation in diffusion models, long-video generation, and audio-visual modeling.

  • Translate research into production by partnering with Applied ML and engineering.

  • Mentor researchers, set research directions, and publish impactful work.

You’ll Bring:

  • A PhD or equivalent research experience, plus 2–3+ years of hands-on experience applying generative models at scale.

  • Expertise in diffusion models and awareness of the latest efficiency techniques.

  • Experience in multimodal generation — spanning video, audio, and language.

  • Proven innovation in long-video generation and/or audio generation.

  • Excellent programming skills — fluent in PyTorch and GPU-optimized workflows.

  • Track record of publications in top-tier venues (CVPR, NeurIPS, BMVC, ICASSP, etc.).

  • Experience leading research activities or mentoring teams.

Nice-to-Haves

  • Skills in 3D graphics, Gaussian splatting, or large-scale training setups.

  • Broad exposure to generative AI models beyond your specialty.

  • Familiarity with software development best practices.

Location
Preferred: San Francisco (hybrid) or London (office opening soon). Remote within U.S. or Europe considered for exceptional candidates.

Frequently Asked Questions

Is the salary disclosed for the AI Researcher (Multimodal Audio/Video Generation) position at Tavus?
The salary for this AI Researcher (Multimodal Audio/Video Generation) role at Tavus is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Where is the AI Researcher (Multimodal Audio/Video Generation) position at Tavus located?
This AI Researcher (Multimodal Audio/Video Generation) role at Tavus is based in San Francisco. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the AI Researcher (Multimodal Audio/Video Generation) role at Tavus full-time or part-time?
This is listed as a full-time position. It is posted as an AI Researcher (Multimodal Audio/Video Generation) role in the Engineering, Product, & Design department at Tavus.
Which team or department does the AI Researcher (Multimodal Audio/Video Generation) at Tavus belong to?
This AI Researcher (Multimodal Audio/Video Generation) position is part of the Engineering, Product, & Design department at Tavus. See the full job description for more information about the team structure and responsibilities.
How do I apply for the AI Researcher (Multimodal Audio/Video Generation) position at Tavus?
Click the "Apply Now" button on this page. You will be redirected to Tavus's official application portal, hosted on Ashby, where you can submit your application directly.
When was the AI Researcher (Multimodal Audio/Video Generation) job at Tavus posted?
This AI Researcher (Multimodal Audio/Video Generation) position at Tavus was posted on May 15, 2026. Apply as soon as possible; early applications are often reviewed first.