Senior Site Reliability Engineer

RHB Banking Group
Bandar Baru Bangi, Selangor, Malaysia
2 days ago
Job description

Overview

Drive SRE practice and deliver the highest level of system and infrastructure resiliency that meets business and regulatory requirements.

Key Responsibilities:

  • Drive consistent SRE practice across all application, infrastructure and IT security teams
  • Set up and operationalize SRE teams identified for specific application, infrastructure and IT security areas
  • Coach SRE engineers and other teams practicing SRE within Group Technology (application and infrastructure support teams) on SRE-related functions to ensure consistent practice across teams
  • Contribute to the development and documentation of SRE best practices and procedures across the Group
  • Take ownership of Application Monitoring tools such as Dynatrace and work with vendors to design and drive consistent use of the monitoring tools across all teams
  • Design, develop, and deploy automation scripts and tools to monitor, manage, and optimize systems
  • Analyze system metrics and logs from Dynatrace or other monitoring tools to identify potential problems and areas for improvement
  • Build internal expertise in Application Monitoring tools to continuously support and enhance observability across all relevant areas as the technology and business environment changes
  • Train and enable active use of Application Monitoring tools across all application and infrastructure support teams
  • Provide support in deep analysis and troubleshooting of technical issues encountered in the Critical and Required High applications and the underlying supporting infrastructure and IT security components, during normal times and during incident / system downtime
  • Advocate and develop a strong culture of system resiliency and delivery of non-functional requirements
  • Support, validate and sign off delivery of SRE-related non-functional requirements during project implementation
  • Continue to fine-tune and enhance SRE practice as the business and technology environment evolves
  • Keep abreast of issues and challenges in system reliability and identify strategic or structural changes to improve it
  • Build strong teamwork and collaboration between SRE, Application, IT Infrastructure and all relevant stakeholders within Group Technology
  • Promote continuous learning and a culture of innovation within the team
  • Build strategic and mutually-beneficial relationships with technology solution partners and service providers to strengthen the Group’s capabilities

Requirements:

  • Master's Degree or Degree in Computer Science, IT or a related discipline
  • 8 to 10 years of IT system development and implementation experience in the Financial Services Industry (FSI)
  • 3 to 5 years of experience in system architecture and design
  • Knowledge of mainframe architecture, operations, and management including z/OS, CICS, CICS Transaction Gateway, and other mainframe-specific technologies
  • Programming Languages: Proficiency in COBOL is highly beneficial; knowledge of Java, C#, or scripting languages (e.g., Bash, Python, PowerShell) can be helpful
  • Systems Reliability: Familiarity with principles of systems reliability, including monitoring, automation, and incident management
  • Networking: Understanding of networking concepts, protocols, and troubleshooting
  • Strong experience in designing and delivering non-functional requirements including High Availability, Disaster Recovery, Archiving, Housekeeping, Backup and Recovery
  • Experience in and strong appreciation of SRE practice, including Service Level Objectives, Service Level Indicators, system observability, elimination of toil, and automation
  • Excellent interpersonal and communication skills and ability to drive a strong SRE culture
  • Strong analytical and problem-solving skills
  • Strong R&D mindset

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Financial Services, IT Services and IT Consulting, and Business Consulting and Services