Senior / Staff Site Reliability, Platform Engineering

Saviynt · Platform Upgrade
Apply Now ↗
📍 Atlanta · Full Time

About this role

About Saviynt

Saviynt is a leader in identity security, delivering an AI-powered platform that governs and secures access to applications, data, and business processes for global enterprises and government institutions. Built for the AI era, Saviynt helps organizations move faster, securely and compliantly.

Why This Role Matters

Saviynt’s SaaS platform runs on complex, distributed, cloud-native systems. As a Staff Platform Engineer, you will play a critical role in ensuring these systems remain highly available, scalable, and secure as the company grows.

This is a hands-on engineering and technical leadership role. You will own reliability for major platform domains, design scalable solutions on Kubernetes and AWS, and drive automation and reliability improvements across multiple teams.

What You’ll Do

In this role, you will design, build, and maintain the shared infrastructure services and platforms that our product and application teams depend on. You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.

• Design and build core platform components and shared infrastructure services that other development teams integrate with and use to deploy and operate their applications
• Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers
• Develop robust, internal-facing tools and automation for infrastructure provisioning and management, primarily using Go (Golang)
• Architect and optimize foundational solutions within cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams
• Design and implement shared Event-Driven Architecture components and messaging platforms, using technologies like Kafka or Google Pub/Sub, that product teams can easily use
• Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams
• Design and build resilient Distributed Systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance
• Manage and optimize our shared infrastructure across Multi-Region Cloud Environments, ensuring that platform services are globally available and performant for all consumers
• Establish and enhance centralized Observability and Monitoring platforms and tools that provide self-service insights for consuming teams
• Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients
• Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services
• Design, implement, and optimize highly available Relational Database services or shared data platforms for broad organizational use
• Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support
• Participate in on-call rotations to support the critical shared infrastructure you build

What We’re Looking For

• 6+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers
• Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)
• Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation
• Extensive hands-on experience with at least one major Cloud Provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them
• Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services
• Solid understanding of and practical experience with CI/CD pipeline tools (especially GitLab CI), and experience establishing automated delivery processes for other teams
• Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components
• Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms
• Proficiency in establishing and using comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure
• Strong experience with RESTful API design principles and building well-documented, consumable APIs
• Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context
• Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service
• Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences
• A strong customer-centric mindset, treating internal development teams as your primary customers
• Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, required

Why Join Saviynt

• Work on a large-scale, cloud-native SaaS platform
• Solve complex reliability challenges at scale
• Influence platform architecture and engineering practices
• Competitive compensation, benefits, and career growth

Security & Compliance

This role requires adherence to Saviynt’s information security and privacy policies, including annual security training.

Frequently Asked Questions

Is the salary disclosed for the Senior / Staff Site Reliability, Platform Engineering position at Saviynt?
The salary for this Senior / Staff Site Reliability, Platform Engineering role at Saviynt is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Where is the Senior / Staff Site Reliability, Platform Engineering position at Saviynt located?
This Senior / Staff Site Reliability, Platform Engineering role at Saviynt is based in Atlanta. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the Senior / Staff Site Reliability, Platform Engineering role at Saviynt full-time or part-time?
This is listed as a Full Time position. It is posted as a Senior / Staff Site Reliability, Platform Engineering role in the Platform Upgrade department at Saviynt.
Which team or department does the Senior / Staff Site Reliability, Platform Engineering role at Saviynt belong to?
This Senior / Staff Site Reliability, Platform Engineering position is part of the Platform Upgrade department at Saviynt. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Senior / Staff Site Reliability, Platform Engineering position at Saviynt?
Click the "Apply Now" button on this page. You will be redirected to Saviynt's official application portal, hosted on Lever, where you can submit your application directly.
When was the Senior / Staff Site Reliability, Platform Engineering job at Saviynt posted?
This Senior / Staff Site Reliability, Platform Engineering position at Saviynt was posted on Feb 18, 2026. Apply as soon as possible, since early applications are often reviewed first.