DevOps / SRE Engineer - AI Platform
1cWkogd8purSPWZfcuHmCe · Technology
About this role
The DevOps / SRE Engineer owns the operational substrate of an AI-native retail decisioning platform — infrastructure, CI / CD, observability, cost meter, and incident response for a system that runs production agents taking real business actions. The role builds on the enterprise Terraform standard, CI / CD spine, and FinOps tagging policy rather than reinventing parallel infrastructure.
Remote candidates outside of Thailand are welcome to apply.
Key Responsibilities:
- Adopt the enterprise Terraform standard and module library for all platform infrastructure; author platform-specific modules where needed (agent runtime, vector DB, knowledge graph); run drift detection weekly.
- Build platform-specific CI / CD pipelines on the enterprise spine (service deploys, agent deploys, eval-gate enforcement); integrate eval gates so no agent reaches production without a passing eval.
- Operate rollback orchestration with sub-15-minute recovery; run quarterly game days.
- Own the platform observability stack — OpenTelemetry, Langfuse for LLM traces, custom dashboards for per-agent cost.
- Implement the per-agent cost meter end-to-end — token counts, vector queries, model inference, downstream LLM Gateway costs; surface cost data to the enterprise GenAI cost dashboard.
- Stand up the platform on-call rotation; author runbooks for every production agent and service; lead incident response with measurable corrective actions.
- Implement platform cost-tagging policy consistent with the enterprise standard (team, domain, environment, project, agent, suite, persona); report monthly to Cost Review.
- Drive cost optimisation — right-sizing, caching, model routing decisions, reserved compute.
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related discipline.
- 5+ years SRE / DevOps with production ownership.
- Terraform at scale — modules, state, drift, environment promotion.
- CI / CD for data + ML / AI services (GitLab CI / CD or comparable).
- Cloud platform (Azure preferred; AWS / GCP transferable).
- Observability — OpenTelemetry, Langfuse (or comparable LLM traces), custom dashboards.
- FinOps — tagging policies, attribution, optimisation.
- Incident response — on-call, post-mortems, runbook authorship.
Preferred Qualifications
- AI / agent platform SRE experience; cost-meter / chargeback systems built or operated.
- Multi-cloud production experience; open-source contributions to IaC / observability tooling.
- AI / ML / agent system observability instrumentation (LLM cost, agent cost, eval scores).
- Vendor certifications such as HashiCorp Terraform Associate / Professional, Azure Solutions Architect Associate, or Databricks Data Engineer Professional.
Frequently Asked Questions
Is the salary disclosed for the DevOps / SRE Engineer - AI Platform position at 1cWkogd8purSPWZfcuHmCe?
The salary for this DevOps / SRE Engineer - AI Platform role at 1cWkogd8purSPWZfcuHmCe is not publicly listed.
Where is the DevOps / SRE Engineer - AI Platform position at 1cWkogd8purSPWZfcuHmCe located?
This DevOps / SRE Engineer - AI Platform role at 1cWkogd8purSPWZfcuHmCe is based in Nawamin Road, Bangkok, Thailand. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the DevOps / SRE Engineer - AI Platform role at 1cWkogd8purSPWZfcuHmCe full-time or part-time?
This is listed as a full-time position. It is posted as a DevOps / SRE Engineer - AI Platform role in the Technology department at 1cWkogd8purSPWZfcuHmCe.
When was the DevOps / SRE Engineer - AI Platform job at 1cWkogd8purSPWZfcuHmCe posted?
This DevOps / SRE Engineer - AI Platform position at 1cWkogd8purSPWZfcuHmCe was posted on May 21, 2026.