
Test Engineer (AI Model Evaluation)

MY E.G. Services Berhad (MYEG) • Petaling Jaya, Selangor, Malaysia

2 days ago

Job description

Overview

We are seeking an AI Model Evaluation Engineer (Junior–Mid Level) with strong QA automation expertise and hands-on experience in evaluating OCR, LLM (chatbot) models, and RAG dataset preparation (speech-to-text, text-to-speech, video-to-OCR). The role focuses on automated testing, ground truth creation, and workflow validation to ensure accuracy, compliance, and real-world reliability for production-ready AI systems.

Responsibilities

Evaluate LLM (chatbot) models, OCR models, and RAG datasets for correctness, bias, compliance, and real-world robustness.
Design and automate test frameworks for RAG pipelines, workflow triggers, and chatbot responses.
Craft and validate ground-truth datasets for OCR, TTS, and speech-to-text projects.
Test chatbot responses for accuracy, context relevance, ethical compliance, and edge cases.
Conduct load testing to ensure system performance under high-traffic and stress scenarios.
Integrate open-source LLM evaluation frameworks (e.g., DeepEval, HuggingFace evaluation tools) into testing pipelines.
Automate data processing & reporting workflows using Google Apps Script and Google Sheets for faster insights.
Document results, define acceptance criteria, and collaborate with ML engineers, data scientists, and QA teams to enhance model reliability.
Support CI/CD pipeline integration for model evaluation and regression testing.

Qualifications

Education

Bachelor’s degree in Computer Science, AI/ML, Software Engineering, Data Science, or a related field.

A Master’s degree is a plus but not mandatory.

Experience

1–3 years in QA automation, model evaluation, or NLP/ML testing roles.

Experience testing LLMs with open-source tools (e.g., DeepEval, RAG testing frameworks).

Hands-on experience in crafting ground-truth datasets for OCR, speech-to-text, or TTS projects.

Exposure to AI chatbot evaluation for bias, fairness, and compliance.

Technical Skills

Programming & Automation: Python, Google Apps Script.

QA Automation: PyTest, Selenium, or similar frameworks.

Seniority level

Entry level

Employment type

Full-time

Job function

Engineering and Information Technology

Industries

IT Services and IT Consulting


