Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist
No longer accepting applications
Mindrift • Port Klang, Port Klang, Malaysia
17 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly
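
As a rough illustration of the responsibilities above, a scenario could be captured as structured data with a golden path and a simple scoring rule. The sketch below is hypothetical: the field names (`id`, `task`, `golden_path`, `edge_cases`), the step labels, and the `score_run` helper are assumptions made for illustration, not Mindrift's actual scenario format or scoring method.

```python
import json

# Hypothetical scenario definition; every field name and step label is an assumption.
scenario = {
    "id": "refund-request-001",
    "task": "Process a customer refund request",
    "golden_path": [
        "lookup_order",
        "check_refund_policy",
        "issue_refund",
        "notify_customer",
    ],
    "edge_cases": ["order not found", "refund window expired"],
}


def score_run(golden_path, agent_steps):
    """Compare an agent's steps to the golden path with step-level precision and recall."""
    expected = set(golden_path)
    taken = set(agent_steps)
    matched = expected & taken
    # Guard against empty inputs so the function never divides by zero.
    precision = len(matched) / len(taken) if taken else 0.0
    recall = len(matched) / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}


# Example agent log: one expected step skipped, one unexpected step added.
agent_steps = ["lookup_order", "issue_refund", "notify_customer", "send_survey"]
result = score_run(scenario["golden_path"], agent_steps)
print(json.dumps(result))
```

This version treats steps as an unordered set, so it would not penalize an agent that performs the right steps in the wrong order. A stricter scorer might compare sequences instead.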

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise
