Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
Pasir Panjang, Negeri Sembilan, Malaysia
9 hours ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise