Member of Technical Staff - ML Infrastructure & Performance

embedding-vc · Moonlake
📍 San Mateo, CA · Full-time

About this role

Introducing Moonlake, AI for creating real-time interactive content

Mission: Improve throughput, latency, and cost by deploying our models 2–10× faster and cheaper without quality regressions.

Scope of Work:

- GPU performance: CUDA/Triton kernels, FlashAttention family, paged attention, CUDA Graphs.

- Serving stack: TensorRT-LLM/Triton Inference Server, vLLM/TGI; continuous batching; on-GPU KV reuse; speculative decoding/Medusa; mixture-of-agents routing.

- Parallelism: FSDP/ZeRO, TP/PP/expert parallel; NCCL tuning.

- Quantization/PEFT: AWQ/GPTQ/FP8; LoRA/DoRA serving.

- Systems: Ray/k8s/Argo, observability (Prom/Grafana/OpenTelemetry), autoscaling, A/B infra, canary + rollback.

Tech signals:

Previous experience at infra-heavy startups such as Databricks and Roblox.

We are committed to being an on-site, in-person team, currently based in San Mateo.

Frequently Asked Questions

Is the salary disclosed for the Member of Technical Staff - ML Infrastructure & Performance position at embedding-vc?
The salary for this Member of Technical Staff - ML Infrastructure & Performance role at embedding-vc is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Where is the Member of Technical Staff - ML Infrastructure & Performance position at embedding-vc located?
This Member of Technical Staff - ML Infrastructure & Performance role at embedding-vc is based in San Mateo, CA. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the Member of Technical Staff - ML Infrastructure & Performance role at embedding-vc full-time or part-time?
This is listed as a full-time position. It is posted as a Member of Technical Staff - ML Infrastructure & Performance role in the Moonlake department at embedding-vc.
Which team or department does the Member of Technical Staff - ML Infrastructure & Performance at embedding-vc belong to?
This Member of Technical Staff - ML Infrastructure & Performance position is part of the Moonlake department at embedding-vc. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Member of Technical Staff - ML Infrastructure & Performance position at embedding-vc?
Click the "Apply Now" button on this page. You will be redirected to embedding-vc's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Member of Technical Staff - ML Infrastructure & Performance job at embedding-vc posted?
This Member of Technical Staff - ML Infrastructure & Performance position at embedding-vc was posted on Dec 12, 2025. Apply as soon as possible; early applications are often reviewed first.