Talent.com
X9546VV3 | [Mandarin-speaking Role] Senior Operations Engineer (SRE / AI Platform)

TTUKoffer • Kuala Lumpur, Malaysia
22 days ago
Job type
  • Quick Apply
Job description
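  • Location: Kuala Lumpur (KL)
  • Salary range: RM14,700 - RM17,700
  • Work visa: Not provided
  • Job highlights

    • Join the international team of a world-leading AI infrastructure service provider and help build and operate cutting-edge AI platforms.
    • Independently own the production environment serving global users, with direct impact on the reliability and performance of core services.
    • Work in depth with multi-cloud architecture, GPU computing, and automated operations, building high-value technical experience.
    • Collaborate across cultures, working closely with technical teams in China and the US, and strengthen your bilingual (Chinese and English) technical communication.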
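  • Core responsibilities

    • End-to-end operational ownership: take full responsibility for the availability, latency, performance, and efficiency of the AI infrastructure products (Model-API, Serverless, GPU instances).
    • Incident response and management: act as first responder for production incidents, investigate root causes (RCA), implement preventive measures, and take part in the on-call rotation.
    • Automation and tooling: design and maintain automation scripts and tools that turn operational tasks, deployments, and failure recovery into repeatable processes.
    • Monitoring and alerting: build and tune monitoring and alerting systems (e.g., Prometheus / Grafana) so that problems are detected proactively.
    • Infrastructure as Code (IaC): manage cloud infrastructure with tools such as Terraform / Ansible to keep environments consistent and reproducible.
    • Performance and cost optimization: continuously analyze system performance and resource usage, identify bottlenecks, and optimize cloud platform (AWS / GCP / Azure) costs.
    • Cross-functional collaboration: work closely with the China-based engineering team to understand new features, give operational feedback, and ensure new services are production-ready.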
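  • Hard requirements

    • 5+ years of experience in DevOps, SRE, or cloud operations; a background at a technology or cloud service company is preferred.
    • Expert knowledge of at least one major cloud platform (AWS / GCP / Azure); hands-on experience with containerization and orchestration (Docker / Kubernetes required).
    • Proficiency in at least one scripting language (e.g., Python / Go / Shell); command of IaC tools such as Terraform / Ansible.
    • Hands-on experience with monitoring and observability tools (e.g., Prometheus / Grafana / ELK).
    • Systematic troubleshooting skills, with the ability to stay calm under pressure while handling complex distributed-systems problems.
    • Fluency in both Chinese and English (written and spoken), sufficient for cross-team technical communication.
    • A strong sense of responsibility and self-motivation; comfortable working independently in a remote / distributed team.
    • Nice to have: experience with GPU-accelerated computing environments; familiarity with MLOps tools (e.g., Kubeflow / MLflow); knowledge of serverless technologies and CI / CD pipelines.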
  • How to apply

      Click 'Apply' or send your resume to [apply@ttukoffer.co.uk] with the email subject [申请 WBX9546VV3]. Referral bonus: a successful referral earns a referral reward. Details: https://ttukoffer.co.uk/refer-a-friend-bonus/

      [Mandarin-speaking Role] Senior Operations Engineer (SRE / AI Platform)

  • Location: Kuala Lumpur
  • Compensation: RM10,000 - RM15,000
  • Visa Sponsorship: Not Available
  • Job Highlights

    • Join the international team of a leading global AI infrastructure service provider to build and operate cutting-edge AI platforms.
    • Take end-to-end ownership of production environments for global users, directly impacting core service reliability and performance.
    • Gain deep exposure to multi-cloud architecture, GPU computing, and automated operations in a high-impact role.
    • Collaborate in a multicultural environment with engineering teams across China and North America, enhancing bilingual technical communication skills.
  • Key Responsibilities

    • End-to-End Service Ownership: Assume primary responsibility for the availability, latency, performance, and efficiency of AI infrastructure products (Model-API, Serverless, GPU Instances).
    • Incident Management & Response: Act as the first responder for production incidents, perform root cause analysis (RCA), and implement preventive measures. Participate in an on-call rotation.
    • Automation & Tooling: Design, build, and maintain automation scripts and tools to streamline operational tasks, deployments, and failure recovery.
    • Monitoring & Alerting: Develop and refine monitoring and alerting systems (e.g., Prometheus / Grafana) to enable proactive issue detection.
    • Infrastructure as Code (IaC): Manage and provision cloud infrastructure using IaC tools (e.g., Terraform, Ansible) to ensure consistency and repeatability.
    • Performance & Cost Optimization: Continuously analyze system performance and resource utilization to identify bottlenecks and optimize cloud platform (AWS / GCP / Azure) costs.
    • Cross-Functional Collaboration: Work closely with engineering teams in China to understand new features, provide operational feedback, and ensure production readiness of new services.
  • Must-Have Requirements

    • 5+ years of hands-on experience in DevOps, SRE, or cloud operations, preferably in a tech or cloud service company.
    • Expertise in at least one major cloud provider (AWS / GCP / Azure); practical experience with containerization and orchestration technologies (Docker / Kubernetes required).
    • Proficiency in at least one scripting language (e.g., Python, Go, Shell); solid understanding of IaC tools like Terraform / Ansible.
    • Hands-on experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK Stack).
    • Systematic problem-solving skills with the ability to troubleshoot complex distributed systems under pressure.
    • Professional fluency in both English and Mandarin (written and spoken) for effective cross-regional collaboration.
    • Strong sense of ownership and self-drive, with the ability to work independently in a remote / distributed team setting.
    • Nice to Have: Experience with GPU-accelerated computing; knowledge of MLOps tools (e.g., Kubeflow, MLflow); familiarity with serverless technologies and CI / CD pipelines.
  • How to Apply?

      Click 'Apply' or send your resume to [apply@ttukoffer.co.uk] with the subject line [Apply to WBX9546VV3]. Refer a friend for this role and earn referral bonuses! See details: https://ttukoffer.co.uk/refer-a-friend-bonus/

      By applying, you acknowledge that TT UKoffer Ltd may process your personal data for recruitment purposes under the lawful basis of legitimate interest. This includes sharing your CV with potential employers. We comply with UK GDPR regulations, and you may request data removal at any time by contacting apply@ttukoffer.co.uk.

