AI Agent Evaluation Analyst

Mindrift, MY
2 days ago
Job description

This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do

The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for:

We’re looking for curious and intellectually proactive contributors, the kind of person who double-checks assumptions and plays devil’s advocate.

Are you comfortable with ambiguity and complexity? Does an async, remote, flexible opportunity sound exciting? Would you like to learn how modern AI systems are tested and evaluated?

This is a flexible, project-based opportunity well-suited for:

  • Analysts, researchers, or consultants with strong critical thinking skills.
  • Students (senior undergrads / grad students) looking for an intellectually interesting gig.
  • People open to a part-time and non-permanent opportunity.

About the project:

We’re looking for QA analysts for autonomous AI agents on a new project focused on validating and improving complex task structures, policy logic, and agent evaluation frameworks. Throughout the project, you’ll balance quality assurance, research, and logical problem-solving. This opportunity is ideal for people who enjoy looking at systems holistically and thinking through scenarios, implications, and edge cases.

You do not need a coding background, but you must be curious, intellectually rigorous, and capable of evaluating the soundness and consistency of complex setups. If you’ve excelled at consulting, CHGK, Olympiads, case solving, or systems thinking, you might be a great fit.

What you’ll be doing:

  • Reviewing evaluation tasks and scenarios for logic, completeness, and realism.
  • Identifying inconsistencies, missing assumptions, or unclear decision points.
  • Helping define clear expected behaviors (gold standards) for AI agents.
  • Annotating cause-effect relationships, reasoning paths, and plausible alternatives.
  • Thinking through complex systems and policies as a human would to ensure agents are tested properly.
  • Working closely with QA, writers, or developers to suggest refinements or edge case coverage.

How to get started:

Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements

  • Excellent analytical thinking: Can reason about complex systems, scenarios, and logical implications.
  • Strong attention to detail: Can spot contradictions, ambiguities, and vague requirements.
  • Familiarity with structured data formats: Can read (not necessarily write) JSON / YAML.
  • Ability to assess scenarios holistically: What’s missing, what’s unrealistic, what might break?
  • Good communication and clear writing (in English) to document your findings.

We also value applicants who have:

  • Experience with policy evaluation, logic puzzles, case studies, or structured scenario design.
  • Background in consulting, academia, olympiads (e.g. logic / math / informatics), or research.
  • Exposure to LLMs, prompt engineering, or AI-generated content.
  • Familiarity with QA or test-case thinking (edge cases, failure modes, “what could go wrong”).
  • Some understanding of how scoring or evaluation works in agent testing (precision, coverage, etc.).

Benefits

  • Get paid for your expertise, with rates that can go up to $38/hour depending on your skills, experience, and project needs.
  • Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
  • Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
  • Influence how future AI models understand and communicate in your field of expertise.