Overview
AI Training Environment Developer - Linux System Administration at Lavu Tech Solutions Sdn Bhd.
This individual contributor role reports to the Manager of the Global Factory IT group. We are looking for a sharp, driven, and autonomy-loving team member to join our IT team to shape and build efficient self-learning applications. The contractor will be involved in developing an AI Development and Model Training environment to support global AI Solutions development needs.
Key Responsibilities
Design and develop a global AI Training environment that dynamically allocates compute and GPU resources based on model training requirements.
Integrate the training environment with the existing Factory MLOps platform for model tracking, cataloging, and deployment using tools such as MLflow and KServe.
Develop a common UI portal for Data Scientists and Communities of Practitioners (CoP) to access the environment, and provide APIs for integration with other factory systems.
Collaborate with teams across different sites and departments to ensure the environment meets stakeholder requirements.
Implement the environment across Seagate's hybrid infrastructure, including AWS cloud and on-premises systems.
Support departmental usage tracking and billing through a chargeback model.
Provide technical support to users encountering issues.
Qualifications
Bachelor's or Master's degree in Computer Science or a related field.
Outstanding analytical and problem-solving skills.
Familiarity with containerized environments, Kubernetes / Docker, and Rancher.
Strong understanding of data structures, microservices application design, network protocols, publish-subscribe models, and JSON.
Proficiency in Python, VueJS, and web services.
Experience with virtual machines (VMs), containerized systems, and cloud infrastructure basics (AWS).
Operating system experience in Linux.
Proven experience in Linux system administration and containerized system development.
Hands-on experience with messaging technologies such as RabbitMQ or Kafka.
Understanding of GenAI, AI/ML solution architecture, and deployment in manufacturing environments.
Excellent communication, stakeholder engagement, and team collaboration skills.
Seniority level
Entry level
Employment type
Full-time
Job function
Information Technology
Industries
Data Infrastructure and Analytics, Technology, Information and Internet, and Software Development
Kuala Lumpur, Malaysia