Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift, Malacca City, Malacca, Malaysia
5 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Ability to define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise