Sr. Site Reliability Engineer

st-labs · Engineering

📍 NYC, NY · Full-time · 🗓 Posted Mar 19, 2026

About Standard Template Labs

Standard Template Labs is an AI-native startup reimagining the future of IT Service and Configuration Management. Backed by leading investors, we're leveraging AI to transform how enterprises manage and engage with their IT ecosystems.

About the Role

We’re looking for a Senior Site Reliability Engineer (SRE) to own the reliability, performance, and scalability of our AI-native platform. You’ll operate at the intersection of software engineering and infrastructure, building systems that keep our platform highly available, observable, and resilient in production.

This is a hands-on engineering role where you’ll write production code (primarily in Python) while also owning on-call operations and incident response.

Responsibilities

Reliability & Production Ownership

Own the availability, latency, and performance of critical production systems
Participate in and improve a 24/7 on-call rotation, responding to incidents and driving resolution
Lead incident response, root cause analysis (RCA), and postmortems
Design systems that fail gracefully and recover automatically

Automation & Engineering (Python-heavy)

Write production-grade Python code to:
- Automate infrastructure workflows
- Build internal reliability tools
- Improve deployment, rollback, and recovery systems
Eliminate manual operational work through automation and self-healing systems

Observability & Monitoring

Design and implement:
- Metrics, logging, and tracing
- Alerting systems that reduce noise and improve signal
Build dashboards and tooling to give real-time visibility into system health

Infrastructure & Scalability

Operate and improve systems running on:
- Cloud platforms (AWS/GCP/Azure)
- Containers (Docker, Kubernetes)
Scale systems to handle enterprise workloads and high-throughput traffic
Improve deployment pipelines, CI/CD, and infrastructure-as-code

Reliability Engineering & Resilience

Define and enforce SLAs, SLOs, and error budgets
Conduct load testing and chaos testing
Build resilient systems that can tolerate failure

Collaboration

Partner with product and backend engineers to:
- Improve system reliability
- Embed observability into services
Help teams design production-ready systems from day one

Qualifications

Core Requirements

Strong software engineering background (not just ops)
Proficiency in Python (required) for building tools and services
Experience operating production systems at scale

Infrastructure & Systems

Experience with:
- Kubernetes / Docker
- Cloud platforms (AWS/GCP/Azure)
- Distributed systems

Reliability & Operations

Experience with:
- On-call rotations and incident response
- Monitoring tools (Grafana, Prometheus, etc.)
- Debugging production issues under pressure

Nice to Have

Experience with:
- AI/ML systems or data pipelines
- Event-driven architectures
- High-availability systems

What We Offer

The opportunity to build foundational product features for an AI-first enterprise platform
The opportunity to take ownership of critical systems that scale to millions of users
A culture that values craftsmanship, autonomy, and technical excellence
Competitive compensation, equity, and benefits package
Work from our Flatiron District, Manhattan office, where you’ll be side-by-side with the founding team in a supportive, collaborative setting. Our team works on-site five days a week, growing and building together, and the location is easy to reach with plenty of public transportation options.

As an equal opportunity employer, we don’t tolerate discrimination or harassment of any kind, whether based on race, ethnicity, age, gender identity, citizenship, religion, sexual orientation, disability, pregnancy, veteran status, or any other protected characteristic as outlined by federal, state, or local laws.

The reasonably estimated yearly salary for this role is $160,000 to $250,000 USD.

Frequently Asked Questions

Is the salary disclosed for the Sr. Site Reliability Engineer position at st-labs?

Yes. The posting gives a reasonably estimated yearly salary of $160,000 to $250,000 USD for this Sr. Site Reliability Engineer role at st-labs, along with equity and benefits. Click "Apply Now" to learn more about the compensation package on their official careers page.

Where is the Sr. Site Reliability Engineer position at st-labs located?

This Sr. Site Reliability Engineer role at st-labs is based in NYC, NY. The position is on-site: the team works from the company's Flatiron District, Manhattan office five days a week.

Is the Sr. Site Reliability Engineer role at st-labs full-time or part-time?

This is listed as a full-time position. It is posted as a Sr. Site Reliability Engineer role in the Engineering department at st-labs.

Which team or department does the Sr. Site Reliability Engineer at st-labs belong to?

This Sr. Site Reliability Engineer position is part of the Engineering department at st-labs. See the full job description for more information about the team structure and responsibilities.

How do I apply for the Sr. Site Reliability Engineer position at st-labs?

Click the "Apply Now" button on this page. You will be redirected to st-labs's official application portal, hosted on Ashby, where you can submit your application directly.

When was the Sr. Site Reliability Engineer job at st-labs posted?

This Sr. Site Reliability Engineer position at st-labs was posted on Mar 19, 2026.
