Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
Seri Kembangan, Selangor, Malaysia
5 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Able to define expected agent behaviors (golden paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise
