AI Evaluation Program Manager

🌍 Remote · 📍 San Francisco · Full-time · 💰 USD 150K–160K/yr · 🗓 Posted Jun 3, 2026

About this role

Who We Are:

At Twelve Labs, we are pioneering cutting-edge multimodal foundation models that comprehend videos the way humans do. Our models have redefined the standards in video-language modeling, giving us more intuitive and far-reaching capabilities and fundamentally transforming the way we interact with and analyze various forms of media.

With a remarkable $107 million in Seed and Series A funding, our company is backed by top-tier venture capital firms such as NVIDIA’s NVentures, NEA, Radical Ventures, and Index Ventures, and prominent AI visionaries and founders such as Fei-Fei Li, Silvio Savarese, Alexandr Wang and more. Headquartered in San Francisco, with an influential APAC presence in Seoul, our global footprint underscores our commitment to driving worldwide innovation.

We are a global company that values the uniqueness of each person’s journey. It is the differences in our cultural, educational, and life experiences that allow us to constantly challenge the status quo. We are looking for individuals who are motivated by our mission and eager to make an impact as we push the bounds of technology to transform the world. Join us as we revolutionize video understanding and multimodal AI.

About the Role:

You will be a vital member of our ML Data Team, which leads the full spectrum of video-language data preparation and model evaluation. This role comes with high ownership and includes responsibilities such as defining dataset needs and requirements in consultation with our research and product teams; designing and building data pipelines; and driving our post-training model evaluation strategy. You will also be responsible for automating as much of the repetitive partnership, annotation, and quality evaluation work as possible. A desire to work cross-functionally and to build relationships is critical for success in this position.

You will:

Model Evaluation: Design and build robust model evaluation frameworks, automating repetitive processes and maintaining a balanced approach to efficiency and depth in obtaining evaluation metrics and feedback.
Portfolio Monitoring: Manage resource allocation and timelines, adjusting direction flexibly based on real-time information across all data streams in your product vertical.
External Partner Collaboration: Enhance dataset and process quality through seamless collaboration with vendors and outsourcing partners.
Data Quality & Tooling Advancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure to build a sustainable data operations framework.
Internal Collaboration: Partner with Engineering and AI Model teams to align on top-priority data needs, design tools such as analytical reports and dashboards, and clearly communicate project progress.

You may be a good fit if you have:

5+ years of experience working in an AI-focused data operations organization.
A proven track record designing and executing large-scale data or evaluation projects, including gathering, labeling, and post-processing data.
The ability to analyze messy and complex data, identify overarching patterns, and distill your findings into crisp annotation guidelines or model quality reports.
Proficiency with Python, LLMs, or other popular industry tools for automation.
Excellent communication and project management skills, and the ability to support several projects simultaneously.
A foundational understanding of and interest in LLMs/VLMs and multimodal AI.
Conviction that data is the key ingredient for the performance and assessment of AI models.

You’ll stand out if you have:

Experience in data collection and labeling for multimodal language models.
Experience in red teaming, localization testing, or other evaluation-focused fields.
Experience working with research scientists and engineers.
Expertise or interest in video-centric domains, such as sports, advertising, and content creation.

Tech Stack:

Development & Analysis: Python (primarily pandas, Jupyter, etc.)
Data Management & Visualization: Amazon S3, various data visualization tools (framework-agnostic)
Project Management Tools: Linear, Notion

Even if there are a few checkboxes that aren’t ticked through your prior experience, we still encourage you to apply! If you are a 0-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.

Benefits and Perks:

🤝 An open and inclusive culture and work environment.

🧑‍💻 Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

🦷 Full health, dental, and vision benefits.

✈️ Flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

Frequently Asked Questions

What is the salary for the AI Evaluation Program Manager role at twelve-labs?

The listed salary for this AI Evaluation Program Manager position at twelve-labs is USD 150K–160K/yr. This is a remote, full-time role.

Is the AI Evaluation Program Manager job at twelve-labs remote?

Yes, this AI Evaluation Program Manager position at twelve-labs is remote, with team members based in San Francisco. You can work from home or anywhere in the supported regions.

Is the AI Evaluation Program Manager role at twelve-labs full-time or part-time?

This is listed as a full-time position. It is posted as an AI Evaluation Program Manager role in the Tech department at twelve-labs.

Which team or department does the AI Evaluation Program Manager at twelve-labs belong to?

This AI Evaluation Program Manager position is part of the Tech department at twelve-labs. See the full job description for more information about the team structure and responsibilities.

How do I apply for the AI Evaluation Program Manager position at twelve-labs?

Click the "Apply Now" button on this page. You will be redirected to twelve-labs's official application portal, hosted on Ashby, where you can submit your application directly.

When was the AI Evaluation Program Manager job at twelve-labs posted?

This AI Evaluation Program Manager position at twelve-labs was posted on Jun 3, 2026. Apply as soon as possible — early applications are often reviewed first.
