It's 2023, and technology has undergone an accelerated phase as we lived through unprecedented times. The 'tech-tonic' shift has reshaped our day-to-day lives and we, at Boost, aspire to shake things up further in the financial services scene. In the last 5 years, some of our highlights include: letting you pay for your roti canai directly from your phone screen, making loan applications completely digital in just 3 minutes, and making insurance bite-sized and customizable (think – insurance for phone screens?).
The Boost-RHB consortium is building towards a Digital Bank, where we strive to make innovative financial services such as these convenient, transparent, and most importantly accessible to anyone and everyone. We want to enable better living for our customers through our inclusive financial services that can universally serve and be embedded in their daily lives.
Join us in creating a roaring future for Malaysia. Don't let this incredible opportunity slip like 2020…too
Responsibilities :
- Provide 24x7 standby support for run-the-bank operations.
- Monitor and maintain the availability and performance of critical banking systems and applications, and identify the current Service Level Indicators (SLIs), ensuring they meet Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
- Troubleshoot and resolve incidents and outages in a timely manner, minimizing impact on customers and business operations.
- Implement and maintain infrastructure automation and orchestration tools to streamline operational processes and improve efficiency.
- Collaborate with cross-functional teams to identify opportunities for performance optimization and scalability enhancements.
- Conduct capacity planning and performance tuning to ensure adequate resources are available to support current and future demand.
- Implement and enforce security best practices to protect sensitive customer data and ensure compliance with regulatory requirements.
- Participate in disaster recovery planning and testing exercises to ensure business continuity in the event of a major outage or disaster.
- Document operational procedures, runbooks, and knowledge base articles to facilitate knowledge sharing and training among team members.
- Provide mentorship and guidance to junior team members, fostering a culture of continuous learning and improvement.
Requirements :
- Bachelor's degree in Computer Science, Engineering, or related field.
- Proven experience as a Site Reliability Engineer or similar role in a production environment, preferably in the banking or financial services industry.
- Proficiency in Linux system administration and shell scripting.
- Experience with container orchestration platforms such as Kubernetes.
- Strong understanding of cloud computing principles and experience with AWS services.
- Working knowledge of infrastructure-as-code tools such as Terraform.
- Familiarity with observability tools such as Splunk or ELK stack.
- Excellent problem-solving skills and a proactive mindset.
- Strong communication and collaboration skills, with the ability to work effectively in a team environment.