Site Reliability Engineer

EngFlow · Product Engineering
Apply Now ↗
🌍 Remote · 📍 Austin · 📍 New York · 📍 San Francisco · Full-time

About this role

About EngFlow

At EngFlow, we help developers save time by accelerating software builds and tests. Our cloud-based, distributed service optimizes developer workflows through remote execution and caching, improving efficiency, productivity, and product quality.

Backed by top investors, EngFlow is redefining how companies build software and ship well-tested products. Our solutions speed up builds by a factor of 10 or more, while our observability platform provides actionable insights for optimization. Founded by key contributors to Bazel, we build tools that empower engineering teams—from startups to Fortune 500 companies—to enhance developer velocity and improve build performance.

Learn more about our mission, culture, and team: EngFlow | Video

We’re looking for an experienced SRE to join our engineering team. You’ll be at the intersection of software engineering and systems operations — ensuring our distributed infrastructure is highly available, performant, and scalable while enabling our engineers to move quickly and confidently.

Key Responsibilities

  • Design, build, and maintain cloud infrastructure for our distributed build acceleration platform

  • Automate everything: from deployment pipelines to monitoring and recovery

  • Manage scalability and reliability for high-throughput, low-latency systems

  • Implement and maintain observability: logging, metrics, tracing, and alerting

  • Work closely with product and engineering teams to embed reliability into every feature

  • Diagnose and resolve production incidents quickly, and feed learnings back into systems design

  • Optimize cost, performance, and resilience across multi-cloud environments

Requirements

  • 4+ years in SRE, DevOps, or Production Engineering roles

  • Experience managing Kubernetes in production

  • Strong background in cloud infrastructure (GCP or AWS) and IaC (Terraform preferred)

  • Solid knowledge of networking, security, and distributed systems

  • Track record of improving system availability and developer productivity

  • A knack for debugging complex, cross-system issues under pressure

Benefits

We offer comprehensive medical, dental, and vision benefits, 401(k)/pension, parental leave, and generous vacation. The team is fully remote, but we enjoy meeting in person several times a year at exciting destinations around the world. We value getting the work done and having fun while doing it: past team events include chocolate, whisky, and tea tastings, monthly team games, and escape rooms.

Frequently Asked Questions

Is the salary disclosed for the Site Reliability Engineer position at EngFlow?
The salary for this Site Reliability Engineer role at EngFlow is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.

Is the Site Reliability Engineer job at EngFlow remote?
Yes, this Site Reliability Engineer position at EngFlow is remote, with team members based in Austin, New York, and San Francisco. You can work from home or anywhere in the supported regions.

Is the Site Reliability Engineer role at EngFlow full-time or part-time?
This is listed as a full-time position. It is posted as a Site Reliability Engineer role in the Product Engineering department at EngFlow.

Which team or department does the Site Reliability Engineer at EngFlow belong to?
This Site Reliability Engineer position is part of the Product Engineering department at EngFlow. See the full job description for more information about the team structure and responsibilities.

How do I apply for the Site Reliability Engineer position at EngFlow?
Click the "Apply Now" button on this page. You will be redirected to EngFlow's official application portal, hosted on Ashby, where you can submit your application directly.

When was the Site Reliability Engineer job at EngFlow posted?
This Site Reliability Engineer position at EngFlow was posted on Jan 27, 2026. Apply as soon as possible; early applications are often reviewed first.