Member of Technical Staff - Data & ML Infra Engineer

moonlake · Engineering & Research
📍 San Francisco, CA · Full-time

About this role

Introducing Moonlake, AI for creating world simulations.

Overview

Moonlake is building the frontier of interactive world models: systems that generate, simulate, and reason over 3D environments for embodied AI, robotics and gaming. We develop the simulation infrastructure to build worlds (e.g., assets, scenes, digital twins) at scale.

Our team sits at the intersection of:

  • Embodied AI

  • Robotics simulation

  • Interactive 3D worlds

  • World models

  • Real-time generation

  • AI infrastructure

Moonlake is building the next generation of AI infrastructure for interactive digital worlds. Our mission is to enable anyone to create, simulate, and interact with rich environments using natural language and multimodal inputs, turning simple ideas into worlds with structure, logic, and agents that can perceive and act.

Our team has raised $28M in seed funding from NVIDIA Ventures, Threshold Ventures, AIX Ventures, and notable angels including Naval Ravikant and Jeff Dean to build the foundational layer for the future of AI, powering everything from creative tools and games to robotics training, simulations, and digital twins. Our goal is to make building and experimenting with these environments as accessible and scalable as publishing video on the internet.

We are looking for exceptional research engineers and applied researchers to help push the frontier of interactive AI.

The Role

We’re looking for a Member of Technical Staff — Data & ML Infrastructure Engineer to help build and optimize the systems that power Moonlake’s model training and inference infrastructure.

This role sits at the core of Moonlake’s platform and focuses on one mission:

Improve throughput, latency, and cost — deploying models 2–10× faster and cheaper without quality regressions.

You’ll work across GPU kernels, inference systems, distributed training, serving infrastructure, observability, and large-scale orchestration systems.

This is a highly technical systems role intended for engineers who enjoy operating at the intersection of:

  • ML systems

  • Distributed infrastructure

  • GPU optimization

  • Production AI deployment

  • Performance engineering

This role emerged directly from Moonlake’s need to better support large-scale world-model training and deployment infrastructure.

What You’ll Do

  • Optimize large-scale model training and inference systems

  • Improve GPU utilization, latency, throughput, and deployment efficiency

  • Build infrastructure that supports real-time world-model and multimodal workloads

  • Develop and optimize serving pipelines for frontier AI systems

  • Work closely with research teams to productionize high-performance models

  • Build scalable orchestration and observability systems for distributed AI infrastructure

  • Improve reliability, rollout safety, autoscaling, and production monitoring

  • Design systems that support fast experimentation without sacrificing stability

Scope of Work

GPU Performance Optimization

  • CUDA / Triton kernels

  • FlashAttention family

  • Paged attention

  • CUDA Graphs

  • Memory optimization

  • Kernel-level performance tuning

Model Serving & Inference

  • TensorRT-LLM

  • Triton Inference Server

  • vLLM / TGI

  • Continuous batching

  • On-GPU KV cache reuse

  • Speculative decoding / Medusa

  • Mixture-of-agents routing
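As a rough illustration of why continuous batching appears on this list, the toy model below counts decode steps for a queue of requests under static batching versus continuous batching. It is a minimal sketch, not how vLLM, TGI, or TensorRT-LLM implement scheduling: the function names and request lengths are hypothetical, each request is assumed to need a known number of decode steps, and prefill, memory limits, and preemption are ignored.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each fixed batch runs until its longest request ends."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled from the queue before the next step."""
    queue = deque(lengths)
    active = []  # remaining decode steps for each in-flight request
    steps = 0
    while queue or active:
        # Fill any free slots with waiting requests.
        while queue and len(active) < batch_size:
            active.append(queue.popleft())
        steps += 1
        # Every active request emits one token; finished requests leave the batch.
        active = [remaining - 1 for remaining in active if remaining > 1]
    return steps


# Hypothetical workload: output lengths (in tokens) of eight queued requests.
lengths = [2, 8, 2, 8, 2, 2, 2, 2]
static_steps = static_batching_steps(lengths, batch_size=2)
continuous_steps = continuous_batching_steps(lengths, batch_size=2)
print(static_steps, continuous_steps)
```

In this toy workload short requests no longer wait for the longest request in their batch, which is where the step savings come from; real gains depend on the traffic mix and the serving stack.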

Distributed Training & Parallelism

  • FSDP / ZeRO

  • Tensor parallelism

  • Pipeline parallelism

  • Expert parallelism

  • NCCL tuning

  • Multi-node GPU orchestration

Quantization & Efficient Fine-Tuning

  • AWQ / GPTQ / FP8

  • LoRA / DoRA serving

  • Efficient deployment pipelines

Infrastructure & Systems

  • Ray

  • Kubernetes

  • Argo

  • Autoscaling systems

  • Canary deployments & rollback infrastructure

  • A/B experimentation systems

  • Observability stack:

    • Prometheus

    • Grafana

    • OpenTelemetry

Why This Role Matters

Moonlake’s products require real-time, highly efficient AI infrastructure capable of powering interactive worlds and embodied intelligence systems at scale.

The difference between:

  • 200ms and 2s latency

  • 40% and 90% GPU utilization

  • Stable rollout and catastrophic regression

…directly impacts the company’s ability to train, deploy, and scale world-model systems.
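To make the utilization gap concrete, here is a back-of-the-envelope sketch of what 40% versus 90% GPU utilization means for cost. The hourly price is a made-up placeholder, not a Moonlake or cloud-provider figure, and the helper name is hypothetical; the only assumption is that useful work scales linearly with utilization.

```python
def cost_per_useful_gpu_hour(hourly_price, utilization):
    """Effective price of one hour of useful GPU work at a given utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_price / utilization


HOURLY_PRICE = 2.00  # hypothetical price per GPU-hour, for illustration only

low = cost_per_useful_gpu_hour(HOURLY_PRICE, 0.40)
high = cost_per_useful_gpu_hour(HOURLY_PRICE, 0.90)

print(f"40% utilization: {low:.2f} per useful GPU-hour")
print(f"90% utilization: {high:.2f} per useful GPU-hour")
print(f"cost ratio: {low / high:.2f}x")
```

Under these assumptions the ratio is 0.90 / 0.40 = 2.25 regardless of the price chosen, so the same workload costs roughly 2.25 times more at 40% utilization than at 90%.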

You’ll help define the infrastructure foundation behind the next generation of interactive AI systems.

We are committed to being an on-site, in-person team currently based in San Francisco.

Posted on Sep 27, 2025.