Talent.com
Test Engineer (AI Model Evaluation)

MY E.G. Services Berhad (MYEG)
Petaling Jaya, Selangor, Malaysia
2 days ago
Job description

Overview

We are seeking an AI Model Evaluation Engineer (Junior–Mid Level) with strong QA automation expertise and hands-on experience evaluating OCR and LLM (chatbot) models and preparing RAG datasets (speech-to-text, text-to-speech, video-to-OCR). The role focuses on automated testing, ground-truth creation, and workflow validation to ensure the accuracy, compliance, and real-world reliability of production-ready AI systems.

Responsibilities

  • Evaluate LLM (chatbot) and OCR models and RAG datasets for correctness, bias, compliance, and real-world robustness.
  • Design and automate test frameworks for RAG pipelines, workflow triggers, and chatbot responses.
  • Craft and validate ground-truth datasets for OCR, TTS, and speech-to-text projects.
  • Test chatbot responses for accuracy, context relevance, ethical compliance, and edge cases.
  • Conduct load testing to ensure system performance under high-traffic and stress scenarios.
  • Integrate open-source LLM evaluation frameworks (e.g., DeepEval, Hugging Face evaluation tools) into testing pipelines.
  • Automate data processing & reporting workflows using Google Apps Script and Google Sheets for faster insights.
  • Document results, define acceptance criteria, and collaborate with ML engineers, data scientists, and QA teams to enhance model reliability.
  • Support CI/CD pipeline integration for model evaluation and regression testing.

Qualifications

Education

  • Bachelor’s degree in Computer Science, AI/ML, Software Engineering, Data Science, or a related field.
  • Master’s degree is a plus but not mandatory.

Experience

  • 1–3 years in QA automation, model evaluation, or NLP/ML testing roles.
  • Experience with open-source LLM testing tools (e.g., DeepEval, RAG testing frameworks).
  • Hands-on experience in crafting ground-truth datasets for OCR, speech-to-text, or TTS projects.
  • Exposure to AI chatbot evaluation for bias, fairness, and compliance.

Technical Skills

  • Programming & Automation: Python, Google Apps Script.
  • QA Automation: PyTest, Selenium, or similar frameworks.

  • Seniority level: Entry level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: IT Services and IT Consulting