Site Reliability Engineer

Ampstek, Kuala Lumpur, Malaysia
14 hours ago
Job description

Position Summary: We are looking for a skilled Site Reliability Engineer (SRE) to join our technology operations team. The ideal candidate will be responsible for building scalable, reliable, and high-performance systems while ensuring continuous uptime and operational excellence. The SRE will work closely with development, DevOps, and infrastructure teams to automate processes, enhance observability, and improve system resilience.

Key Responsibilities

  • Design, build, and maintain highly available and scalable infrastructure across cloud and on-premise environments.
  • Implement monitoring, alerting, and incident response systems using tools such as Prometheus, Grafana, ELK, or Splunk.
  • Automate deployment, scaling, and operations using Infrastructure-as-Code (IaC) tools like Terraform, Ansible, or CloudFormation.
  • Drive CI/CD pipeline enhancements and ensure seamless integration and deployment workflows (e.g., Jenkins, GitLab CI, or Azure DevOps).
  • Collaborate with development teams to improve system reliability, observability, and performance.
  • Troubleshoot production issues, perform root cause analysis (RCA), and implement long-term fixes.
  • Manage incident response and postmortems, reducing Mean Time To Recovery (MTTR).
  • Work with Kubernetes/Docker environments to support microservices and containerized deployments.
  • Ensure robust disaster recovery and backup strategies, along with adherence to security and compliance requirements.

Must-Have Skills

  • Strong experience as an SRE, DevOps Engineer, or Cloud Infrastructure Engineer in large-scale production environments.
  • Proficiency in Linux/Unix system administration and shell scripting.
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP).
  • Expertise in containerization and orchestration tools such as Docker and Kubernetes.
  • Experience with CI/CD tools (Jenkins, GitLab CI, or Azure DevOps).
  • Knowledge of Infrastructure-as-Code tools (Terraform, Ansible, or CloudFormation).
  • Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK Stack, Splunk, Datadog, or New Relic).
  • Experience in automating repetitive tasks using Python, Bash, or Go.

Seniority level: Mid-Senior level
Employment type: Contract
Job function: Information Technology
Industries: IT Services and IT Consulting