Site Reliability Engineer

Unison Group
Kuala Lumpur, Federal Territory of Kuala Lumpur, MY
16 days ago
Job description

As a Site Reliability Engineer (SRE), you will play a key role in maintaining the reliability and performance of critical services. Your expertise will help bridge the gap between development and operations, ensuring robust, scalable, and responsive infrastructure. This role emphasizes strong system architecture and design principles, focusing on key SRE practices such as Service Level Objectives (SLOs), Service Level Indicators (SLIs), and the reduction of operational toil. You will collaborate closely with diverse teams to drive reliability improvements and foster a culture of continuous learning and accountability.

Requirements

Key Responsibilities:

Design and implement resilient system architectures that support high availability and scalability.

Develop automation tools and scripts to enhance operational efficiency and reduce manual effort.

Define, track, and analyze SLOs and SLIs to ensure reliability and performance meet business needs.

Conduct thorough post-mortem analyses following incidents, driving continuous improvement through root cause identification and solution implementation.

Collaborate with development and operations teams to establish best practices in system reliability and incident management.

Troubleshoot and resolve issues related to database performance, network connectivity, and deployment failures, including diagnosing problems at the underlying platform level (e.g., Kubernetes, virtual machines).

Ensure that issues are resolved within the stipulated Service Level Agreements (SLAs), maintaining high standards of service delivery.

Identify and troubleshoot performance bottlenecks across systems, providing actionable recommendations for enhancements.

Maintain detailed documentation of processes and incident responses to support knowledge sharing and compliance.

Qualifications:

Proficiency in a programming language such as Python, Golang, Java, or similar, with a focus on operational efficiency.

Demonstrated experience in system architecture and design, prioritizing reliability and scalability.

Strong understanding of SRE principles, including SLOs, SLIs, toil reduction, and incident post-mortems.

Experience with cloud environments (e.g., AWS, Azure, Google Cloud) and their operational management.

Strong expertise in Linux system administration.

Proven experience in troubleshooting application support issues with a focus on performance and connectivity.

Familiarity with networking concepts and effective troubleshooting techniques.

Excellent problem-solving abilities and a proactive approach to operational challenges.

Ability to work independently while effectively collaborating within a team environment.

Preferred Skills:

Familiarity with monitoring tools and performance optimization techniques.

Experience in scripting or automation for system administration tasks.

Knowledge of networking concepts and troubleshooting methodologies.

Hands-on knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and their services.

Familiarity with DevOps practices and frameworks, including CI/CD, infrastructure as code, and containerization.
