Talent.com
Site Reliability Engineer

Glints, Malaysia
13 hours ago
Job description

About the Role

We are seeking a Site Reliability Engineer (SRE) to join our client's team in Malaysia. You will be responsible for maintaining the stability, scalability, and reliability of critical applications and infrastructure, while driving automation and performance optimization.

This position requires Mandarin fluency, as you will collaborate with Mandarin-speaking stakeholders. The role is open exclusively to Malaysian Citizens or Permanent Residents (PR).

It's a great opportunity for those with a strong background in Python (preferred) or Java / Golang with Linux scripting to advance their SRE career.

Key Responsibilities

  • Monitor and maintain system performance, reliability, and uptime.
  • Design and implement scalable, resilient system architectures.
  • Develop automation tools / scripts to reduce manual work.
  • Define, track, and analyze SLOs and SLIs to measure system reliability.
  • Troubleshoot issues across databases, networks, and deployments (incl. Kubernetes).
  • Conduct incident post-mortems and drive continuous improvement.
  • Collaborate with DevOps / engineering teams to establish best practices.
  • Participate in on-call rotations and respond to critical issues.

What We're Looking For

  • Minimum 1.5 years of relevant experience
  • Fluent in English & Mandarin (mandatory)
  • Strong skills in Python (preferred); if not, Java or Golang + Linux scripting & Bash
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP)
  • Proficiency in Linux administration and troubleshooting
  • Familiarity with SRE concepts (SLIs, SLOs, toil reduction, incident management)
  • Comfortable working on rotational shifts

Nice to Have

  • Experience with Kubernetes, containers, CI/CD, Infrastructure as Code
  • Knowledge of monitoring tools and performance optimization
