Talent.com
Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
Gelang Patah, Johor, Malaysia
Posted 6 days ago
Job description

Mindrift is looking for a freelance Agent Scenarios Designer based in the specified country. The role focuses on designing realistic and structured evaluation scenarios for LLM‑based agents, testing agent outputs, and refining tests. You will work on a flexible schedule and receive pay up to $38 / hr based on experience.

What We Do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting‑edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real‑world expertise from across the globe.

About the Role

You will design realistic and structured evaluation scenarios, create test cases that simulate human‑performed tasks, and define gold‑standard behavior to compare agent actions against. Your work will ensure each scenario is clearly defined, well‑scored, and easy to execute and reuse. You need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Responsibilities

  • Design structured test scenarios based on real‑world tasks
  • Define the golden path and acceptable agent behavior
  • Annotate task steps, expected outputs, and edge cases
  • Work with developers to test scenarios and improve clarity
  • Review agent outputs and adapt tests accordingly
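
To make the responsibilities above more concrete, here is a minimal sketch of how a scenario with a golden path might be recorded and scored. It is an illustration only, not Mindrift's actual format or tooling: the `Scenario` class, its field names, the `score` function, and the sample task and action names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical scenario record; field names are illustrative only."""
    task: str
    golden_path: list[str]  # actions the agent is expected to take
    acceptable_extras: set[str] = field(default_factory=set)  # tolerated extra actions


def score(scenario: Scenario, agent_actions: list[str]) -> dict[str, float]:
    """Compare agent actions with the golden path using set-based precision and recall."""
    expected = set(scenario.golden_path)
    # Drop actions the scenario explicitly tolerates before scoring.
    taken = set(agent_actions) - scenario.acceptable_extras
    hits = expected & taken
    precision = len(hits) / len(taken) if taken else 0.0
    recall = len(hits) / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}


scenario = Scenario(
    task="Refund a duplicate order",
    golden_path=["look_up_order", "verify_duplicate", "issue_refund", "notify_customer"],
    acceptable_extras={"greet_customer"},
)
result = score(
    scenario,
    ["greet_customer", "look_up_order", "issue_refund", "delete_account"],
)
print(result)
```

This sketch ignores the order of actions; a real scenario might also check sequence, expected outputs at each step, and edge cases.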

How to Get Started

Apply to this posting, qualify, and you’ll have the chance to contribute to projects aligned with your skills on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements

  • Bachelor’s and / or Master’s degree in Computer Science, Software Engineering, Data Science / Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / NLP, Information Systems or related fields
  • Background in QA, software testing, data analysis, or NLP annotation
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases)
  • Strong written communication skills in English
  • Comfortable with structured formats like JSON / YAML for scenario description
  • Can define expected agent behaviors (gold paths) and scoring logic
  • Basic experience with Python and JavaScript
  • Curious and open to working with AI‑generated content, agent logs, and prompt‑based behavior
  • Ready to learn new methods, able to switch between tasks and topics quickly, and sometimes work with challenging, complex guidelines
  • Fully remote freelance role – only requires a laptop, internet connection, available time, and enthusiasm to take on a challenge

Nice to Have

  • Experience in writing manual or automated test cases
  • Familiarity with LLM capabilities and typical failure modes
  • Understanding of scoring metrics (precision, recall, coverage, reward functions)

Benefits

  • Get paid for your expertise, with rates up to $38 / hr depending on your skills, experience, and project needs
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments
  • Gain valuable experience to enhance your portfolio through an advanced AI project
  • Influence how future AI models understand and communicate in your field of expertise