AI / Machine Learning
Onsite

Freelance Agent Evaluation Engineer

Mindrift - Company

New South Wales, Australia
Freelance - Mid level (2-4 years)
0 applicants
Closes Jun 14, 2026

Salary

CHECK DESCRIPTION

Job description

Role overview

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focusing on testing and improving AI systems. This freelance role involves building datasets to evaluate AI coding agents' performance on real-world developer tasks.

You'll design challenging technical tasks and create evaluation criteria to measure how effectively AI models handle software development challenges. The position is project-based and is not a permanent employment arrangement.

Responsibilities

Design and implement evaluation datasets for AI coding agents
Create realistic software development tasks to test AI capabilities
Develop scoring criteria to measure AI model performance
Collaborate with AI teams to improve evaluation methodologies

Requirements

2+ years of experience in AI/ML testing or software development
Strong understanding of coding workflows and software development practices
Proficiency in English (written and verbal communication)
Experience with AI evaluation frameworks or methodologies

Benefits

Project-based work with leading tech companies
Contribute to cutting-edge AI evaluation research
Flexible freelance arrangement

Keywords

AI evaluation, machine learning testing, coding agents, dataset creation, AI systems, developer tasks, model evaluation, English proficiency