Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist
Mindrift • Pasir Gudang, Johor, Malaysia
1 day ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly
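
As an illustration of the kind of work described above, here is a minimal sketch of a scenario record with a golden path and simple scoring logic. It is not Mindrift's actual format or tooling: the `Scenario` class, its field names, the action names, and the `score` function are all hypothetical, chosen only to show how agent actions might be compared against gold-standard behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical scenario record; all field names are illustrative."""
    task: str
    golden_path: list[str]  # actions the ideal agent is expected to take
    acceptable_extras: set[str] = field(default_factory=set)  # tolerated deviations


def score(scenario: Scenario, agent_actions: list[str]) -> dict[str, float]:
    """Compare agent actions with the golden path using precision and recall."""
    gold = set(scenario.golden_path)
    # Drop tolerated extra actions so they do not count against the agent.
    taken = set(agent_actions) - scenario.acceptable_extras
    hits = gold & taken
    precision = len(hits) / len(taken) if taken else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return {"precision": precision, "recall": recall}


# Assumed example data: a made-up support task and a made-up agent log.
scenario = Scenario(
    task="Refund a duplicate order",
    golden_path=["look_up_order", "verify_duplicate", "issue_refund", "notify_customer"],
    acceptable_extras={"greet_customer"},
)
result = score(scenario, ["greet_customer", "look_up_order", "issue_refund", "close_ticket"])
print(result)
```

In this made-up run the agent completes two of the four golden-path steps and takes one unexpected action, so recall is 0.5 and precision is 2/3. A real scenario would likely also check step order and expected outputs, which this sketch ignores.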

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Able to define expected agent behaviors (golden paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise