Talent.com
Site Reliability Engineer - Incident Commander

Siemens AG • Bayan Lepas, Penang, Malaysia
Posted 21 hours ago
Job description

Job Family : Software

Req ID :

Siemens Digital Industries Software is a leading provider of solutions for the design, simulation, and manufacture of products across many different industries. Formula 1 cars, skyscrapers, ships, space exploration vehicles, and many of the objects we see in our daily lives are being conceived and manufactured using our Product Lifecycle Management (PLM) software.

Are you ready to make a tangible impact on critical cloud-based applications in a dynamic and collaborative environment? Join our organization, where you will be at the forefront of enhancing service and application availability, optimizing processes through innovative automation, and solving complex technical challenges.

In this crucial role, you will develop cutting‑edge automated solutions that support and sustain our best‑in‑class cloud infrastructure, particularly for the vital Siemens Xcelerator platform. When incidents arise, you will coordinate major incident response, ensuring rapid resolution and seamless communication with our partners during service‑impacting events, all while upholding our strict Service Level Agreements (SLAs). Your exceptional communication and coordination skills will be paramount, as you will directly contribute to our product teams consistently meeting their commitments and driving overall platform reliability.

Key Responsibilities

  • Incident Management : Act as the primary point of contact and leader during major incidents, coordinating the response, communication, and resolution efforts across all involved teams.
  • Incident Response : Quickly assess the severity of incidents, determine the impact, and drive the appropriate response to restore services as quickly as possible.
  • Communication : Ensure clear, concise, and timely communication with stakeholders, including technical teams, management, and customers, throughout the incident lifecycle.
  • Post‑Incident Analysis : Lead post‑incident reviews to identify root causes, drive improvements, and implement preventive measures to reduce the likelihood of recurrence.
  • Collaboration : Work closely with SRE, DevOps, Development, and other relevant teams to ensure that incident management processes are well‑defined and continuously improved.
  • Training & Preparedness : Conduct regular incident response drills, train teams on incident management processes, and ensure readiness for handling high‑severity incidents.
  • Documentation : Maintain and update incident management documentation, ensuring that all procedures are up‑to‑date and accessible to all relevant teams.
  • Monitoring & Alerts : Collaborate with SRE and monitoring teams to define and refine alerting criteria, ensuring that incidents are detected and escalated promptly.
  • Continuous Improvement : Find opportunities to improve system reliability, scalability, and performance based on lessons learned from incidents.
  • 24x7 On‑Call Rotation : Participate in the 24x7 on‑call rotation.

Qualifications

  • Technical Expertise : Familiarity with cloud infrastructure (AWS, GCP, Azure), containerization (Docker, Kubernetes), and automation scripting (Python, Bash).
  • Incident Management Tools : Familiarity with incident management platforms (e.g., Jira Service Management, ServiceNow), monitoring tools (e.g., Datadog, Grafana), and on‑call systems (e.g., PagerDuty).
  • Incident Response & Resolution : Proven ability to rapidly assess, troubleshoot, and resolve complex incidents in distributed enterprise IT environments, ensuring quick service restoration while remaining calm under pressure.
  • Leadership & Stakeholder Management : Demonstrated leadership in incident response, effectively managing cross‑functional teams and aligning with business and product stakeholders.
  • Communication : Outstanding English communication skills, both verbal and written, including strong listening and synthesis abilities.
  • Metrics & Continuous Improvement : Skilled in defining, tracking, and utilizing incident metrics (e.g., MTTR, MTTD) to drive accountability and continuous improvement.
  • Problem‑Solving : Excellent troubleshooting and problem‑solving skills, with the ability to quickly analyze complex systems.
  • Proactive Learning & Availability : Highly motivated to continuously learn new technologies and adapt to evolving trends, with availability to work required core hours.
  • Nice to have : Relevant certifications (e.g., AWS Certified Solutions Architect, Certified Kubernetes Administrator).

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status.

We are Siemens

A collection of over 377,000 minds building the future, one day at a time in over 200 countries. We're dedicated to equality, and we welcome applications that reflect the diversity of the communities we work in. All employment decisions at Siemens are based on qualifications, merit, and business need. Bring your curiosity and creativity and help us shape tomorrow!

We offer a comprehensive reward package which includes a competitive basic salary, bonus scheme, generous holiday allowance, pension, and private healthcare.

Transform the everyday

Organization : Digital Industries

Job Type : Full-time

Category : Information Technology
