About Swift
We’re the world’s leading provider of secure financial messaging services, headquartered in Belgium. We are the way the world moves value – across borders, through cities and overseas. No other organisation can address the scale, precision, pace and trust that this demands, and we’re proud to support the global economy. We’re unique too. We were established to find a better way for the global financial community to move value – a reliable, safe and secure approach that the community can trust, completely. We’re always striving to be better and are constantly evolving in an ever-changing landscape, without undermining that trust. Five decades on, our vibrant community reflects the complexity and diversity of the financial ecosystem. We innovate diligently, test exhaustively, then implement fast. In a connected and exciting era, our mission has never been more relevant. Swift now has a presence in 200+ countries and legal territories to serve a community of more than 12,000 banks and financial institutions.
Role
Senior Site Reliability Engineer at Swift – Kuala Lumpur, Malaysia
Responsibilities
- Contribute to system design and deployment phases with a focus on scalability, reliability, and operability. Ensure that production readiness is considered at every stage of the software lifecycle.
- Develop automation scripts, infrastructure as code, and tooling using industry best practices to improve system reliability, reduce manual effort, and enable self-service.
- Review system architectures, deployment strategies, observability setups, and operational documentation to ensure reliability and operational excellence.
- Analyze production issues, identify root causes, and implement long-term reliability improvements through automation, monitoring, and architectural enhancements.
- Work collaboratively with other team members and provide guidance to more junior team members.
- Organize an efficient handover through high quality documentation and training.
- Automate the deployment and operation of multi-tenant infrastructure, handling tasks that ensure system resilience and availability.
- Develop and maintain monitoring tools, dashboards, and self-healing mechanisms.
- Participate in on-call rotations, conduct blameless postmortems, and drive continuous learning.
- Work closely with developers, product teams, and engineering stakeholders to troubleshoot issues, improve systems, and integrate reliability improvements.
What will make you successful?
- Minimum 6 years of experience in Site Reliability Engineering or software development within an international company.
- Hands-on experience with CI/CD and deployment tools such as Ansible, Jenkins, Maven, Nexus, Git, and Docker.
- Proficiency in Linux OS.
- Proficiency in scripting and automation (e.g. Python, PowerShell, YAML) with the ability to develop tools and infrastructure as code.
- Familiarity with Java-based systems with the ability to understand code for root cause analysis.
- Understanding of distributed systems and microservices architectures, including REST and SOAP APIs.
- Experience with databases, including NoSQL platforms.
- Familiarity with performance and reliability testing tools such as JMeter or Postman.
- Exposure to observability and analytics technologies; experience with Elasticsearch or reporting tools like Power BI is a plus.
- Practical experience working in Agile-driven teams.
- Strong interpersonal and communication skills, with a customer-centric mindset and the ability to work effectively across cultures.
- Demonstrated ability to collaborate with distributed teams across multiple time zones.
What We Offer
- We put you in control of your career
- We give you a competitive package
- We help you perform at your best
- We help you make a difference
- We give you the freedom to be yourself
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
EEO statements and related information may be included in the application process; please review the employer’s official postings for complete details.