Job Responsibilities
Oversee the design, implementation, and maintenance of IT systems supporting operational activities, ensuring high availability and performance of GPU resources.
Provide technical guidance across complex infrastructure projects.
Develop and execute operational strategies aligned with the company’s goals for GPU-as-a-Service, focusing on scalability, efficiency, and reliability.
Lead and mentor a diverse team of technology professionals, fostering a culture of innovation, accountability, and continuous improvement.
Manage relationships with key vendors and third-party service providers to ensure compliance with SLAs and industry standards.
Identify process improvement opportunities across operations and implement best practices to enhance productivity, reduce costs, and improve service quality.
Collaborate with product development, sales, and marketing teams to ensure seamless service integration and alignment with customer needs.
Ensure compliance with relevant laws, regulations, and industry standards related to data protection and service delivery.

Minimum Requirements
Malaysian nationality.
Bachelor’s degree in Computer Science or a related technical field.
10+ years of proven experience in operations or technology leadership within the IT or cloud services industry.
Strong understanding of GPU technologies and cloud computing principles.
Experience managing complex IT systems and operational processes.
Exceptional analytical and troubleshooting skills.
Knowledge of Kubernetes environments and the ability to debug them.
Familiarity with energy-efficient computing and sustainable data center operations.
Ability to manage priorities in a dynamic, fast-paced environment.
Hands-on expertise with CPU/GPU clusters and platforms.
Excellent communication skills for technical and non-technical audiences.
Strong interpersonal skills for developing professional relationships across teams.
Proven ability to manage multiple projects with attention to detail.
Knowledge of operating and managing CPU/GPU cluster processes.
Strategic thinking and ability to implement innovative solutions.
Excellent documentation skills for technical designs, issues, and procedures.
System • Kuala Lumpur, Malaysia