Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift • Subang Jaya, Selangor, Malaysia
11 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in Malaysia. The role focuses on designing realistic, structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay of up to $38 / hr, depending on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise