Engineer, Supercomputing & Distributed Systems

Krea · Engineering
📍 San Francisco · Full-time

About this role

About Krea

At Krea, we are building next-generation AI creative tools.

We are dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it.

We believe AI is a new medium that allows us to express ourselves through various formats—text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.

Supercomputing / AI Infra at Krea

We build and operate the infrastructure for Krea's research and inference: distributed training, Kubernetes clusters with 1000+ GPUs, petabyte-scale data pipelines, and more. We build a lot of this from scratch, including custom distributed datastores, job orchestration systems, and streaming pipelines that replace tools like Kafka and Ray for modern AI workloads at scale.

Example projects:

Distributed data systems

  • Design multi-stage pipelines that turn petabytes of raw data into clean, annotated datasets

  • Run classification models on billions of images

  • Deploy and combine LLMs to caption massive multimedia data

GPU infrastructure

  • Manage distributed training and inference on Kubernetes clusters of 1000+ GPUs

  • Solve orchestration and scaling for large-scale GPU job processing

  • Scale workloads and research between clusters in multiple datacenters

Distributed training

  • Profile and optimize dataloaders streaming thousands of images per second

  • Profile and debug InfiniBand networking on huge training runs

  • Build fault tolerance systems for large-scale pretraining

  • Collaborate with researchers on evolving RL infrastructure

Applied ML pipelines

  • Find clean scenes in millions of videos using distributed shot-boundary detection

  • Customize and train models that filter billions of images by answering questions like "is this a screenshot?"

  • Build the systems that bridge raw cluster capacity and research output

Who we're looking for:

Systems people. If you've read a blog post about InfiniBand debugging or building a custom distributed database and thought "I want to do that" — this is that team.

You'll spend your time working heavily with Python, Kubernetes, Torch, and data tools like DuckDB, Arrow, etc. It's OK if you don't have K8s or ML experience — the main thing we hire for is an intuition for distributed systems, and a great mental model of how systems interact and function under different conditions.

Strong candidates may have experience with…

  • Python, PyArrow, DuckDB, SQL, massive relational databases, PyTorch, Pandas, NumPy…

  • Kubernetes

  • Designing and implementing large-scale ETL systems

  • Fundamental knowledge of containerization, operating systems, file systems, and networking

  • Distributed systems design

  • Distributed training systems (NCCL, InfiniBand, RDMA)

  • Streaming and event processing systems (Kafka, Pulsar, or similar)

  • PyTorch internals, custom dataloaders, and training infrastructure

Frequently Asked Questions

Is the salary disclosed for the Engineer, Supercomputing & Distributed Systems position at Krea?
The salary for this role is not publicly listed. Click "Apply Now" to learn more about the compensation package on Krea's official careers page.
Where is the Engineer, Supercomputing & Distributed Systems position at Krea located?
This role is based in San Francisco. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the Engineer, Supercomputing & Distributed Systems role at Krea full-time or part-time?
This is listed as a full-time position. It is posted as an Engineer, Supercomputing & Distributed Systems role in the Engineering department at Krea.
Which team or department does the Engineer, Supercomputing & Distributed Systems at Krea belong to?
This position is part of the Engineering department at Krea. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Engineer, Supercomputing & Distributed Systems position at Krea?
Click the "Apply Now" button on this page. You will be redirected to Krea's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Engineer, Supercomputing & Distributed Systems job at Krea posted?
This position was posted on Apr 3, 2026. Apply as soon as possible, since early applications are often reviewed first.