Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
Batu Kawan, Penang, Malaysia
4 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in Malaysia. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly
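
As a rough illustration of the responsibilities above, a scenario with a golden path and a simple scoring rule might be sketched as follows. This is a minimal sketch under stated assumptions, not Mindrift's actual format: the field names (`id`, `task`, `golden_path`, `edge_cases`), the action names, and the `score_run` helper are all hypothetical, and the scoring ignores step order for simplicity.

```python
import json

# Hypothetical scenario definition; every field and action name is an
# illustrative assumption, not a format taken from the posting.
SCENARIO = {
    "id": "refund-request-001",
    "task": "Process a customer refund request",
    "golden_path": [
        "lookup_order",
        "check_refund_policy",
        "issue_refund",
        "notify_customer",
    ],
    "edge_cases": ["order not found", "refund window expired"],
}


def score_run(golden_path, agent_actions):
    """Compare an agent's actions with the golden path, ignoring step order."""
    expected = set(golden_path)
    actual = set(agent_actions)
    matched = expected & actual
    # Precision: share of the agent's actions that belong to the golden path.
    precision = len(matched) / len(actual) if actual else 0.0
    # Recall: share of golden-path steps the agent actually performed.
    recall = len(matched) / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}


if __name__ == "__main__":
    # A made-up agent log: two correct steps, one unexpected step.
    agent_actions = ["lookup_order", "issue_refund", "send_survey"]
    print(json.dumps(SCENARIO, indent=2))
    print(score_run(SCENARIO["golden_path"], agent_actions))
```

A real scoring scheme would likely also check step order, expected outputs, and the listed edge cases; the set-based comparison here is only the simplest starting point.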

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • This is a fully remote freelance role: you only need a laptop, an internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise