VISTRA Federal Territory of Kuala Lumpur, Malaysia
It’s never been a more exciting time to join Vistra.
At Vistra our purpose is progress. We believe that our clients have the power to change the world and to do great things for global progress, and we exist to remove the friction that comes from the complexity of global business – to help our clients achieve progress without friction.
But progress only happens when people come together and take action. And we’re absolutely committed to building a culture where our people can do just that.
We have an exciting opportunity for you to join our team as a DevOps Engineer. Reporting to the Senior Manager, this full-time, permanent position is based in Kuala Lumpur.
Job Overview
Key Responsibilities :
- Design, implement, and optimize CI / CD pipelines using tools like Azure DevOps to support automated builds, testing (including Katalon Studio integration for QA), and deployments across .NET, Java, Node.js, and other stacks
- Infrastructure as Code (IaC) : Develop and maintain IaC scripts using Terraform or similar tools to provision and configure Azure resources, ensuring consistency across development, UAT, and production environments
- Containerization and Orchestration : Manage containerized environments using Docker and Kubernetes, enabling scalable, isolated test and production setups, including headless test runs for QA automation
- Database and Backup Management : Oversee database configuration, backups, and recovery strategies for Azure databases (e.g., Azure SQL Database, MS SQL), ensuring data integrity and compliance with SOC 2 best practices
- Disaster Recovery and BCP : Design and implement disaster recovery (DR) strategies, leveraging tools like Azure Site Recovery and geo-redundant backups
- RTO / RPO Management : Document and optimize Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for critical systems
- BCP Testing : Conduct and support Business Continuity Planning (BCP) testing to ensure operational resilience
- Monitoring and Logging : Implement and manage monitoring solutions (e.g., Azure Monitor, Application Insights, Site24x7, Prometheus, Grafana) to track system health, performance, and costs, providing real-time alerts and cost optimization recommendations (e.g., Reserved Instances)
- Environment Setup and Optimization : Create reusable templates / scripts to provision Azure VMs, containers, or other resources in hours, not weeks, addressing delays in environment setup and QA VM time limits
- Collaboration with IT Infrastructure : Work closely with the IT infrastructure team to manage network configurations, security policies (e.g., WIZ, Cloudflare), and Azure resource provisioning, navigating dependencies to minimize delays
- Cost Optimization : Analyze resource usage and recommend cost-saving strategies, such as dynamic scaling and Azure Reserved Instances
- Security and Compliance : Ensure infrastructure and processes adhere to SOC 2 compliance, safeguarding against unauthorized changes and protecting UAT environments from real customer data exposure
- Automation and Scripting : Automate repetitive tasks using Python, PowerShell, Bash, or Azure CLI to improve efficiency and reduce manual intervention
- Troubleshooting : Resolve deployment, infrastructure, and production issues promptly, including slow deployment times and ambiguity over resource management
Requirements
Required Skills and Experience :
- Technical Expertise : Proven experience with CI / CD tools (Azure DevOps, GitHub Actions) and YAML-based pipeline creation; strong proficiency in Infrastructure as Code (IaC) tools like Terraform; hands-on experience with containerization (Docker) and orchestration (Kubernetes); deep knowledge of Azure services and hybrid cloud / on-prem environments; proficiency in scripting languages (Python, PowerShell, Bash, Azure CLI); experience with Microsoft technologies (.NET Framework, .NET Core, IIS, SQL Server); familiarity with monitoring and logging tools (Azure Monitor, Site24x7, Application Insights, Prometheus, Grafana, Loki); knowledge of version control systems (Git, Azure Repos, GitHub)
- Domain Knowledge : Experience with QA automation tools like Katalon Studio; understanding of database administration, backup, and recovery processes; experience with disaster recovery strategies, including Azure Site Recovery, geo-redundant backups, and RTO / RPO documentation; familiarity with load balancing, scalability, and high-availability setups for Microsoft-based applications
- Collaboration and Communication : Strong collaboration skills to work with development teams, QA, and IT infrastructure personnel; ability to navigate complex dependencies
- Problem-Solving : Proven ability to troubleshoot and resolve deployment, infrastructure, and production issues; experience in optimizing costs through automation and resource management
- Compliance and Security : Understanding of SOC 2 compliance requirements, particularly in separating code creation and deployment roles; familiarity with security tools (e.g., WIZ, Cloudflare) and data privacy practices
Benefits
Company Benefits : At our Malaysia office, we believe in putting our employees’ well-being first! We offer a flexible hybrid working arrangement and birthday leave. Additionally, we provide comprehensive medical insurance and dental coverage, a wellness allowance, and a competitive annual leave entitlement to support your well-being and give you time to recharge or explore your passions outside of work. As advocates of continuous learning and professional development, we provide an internal mentorship program and reimburse professional membership fees for certifications like ICSA, ensuring you stay ahead in your field.