Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift • Kuala Lumpur, Kuala Lumpur, Malaysia
8 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise
