AI / Machine Learning | Onsite

Freelance Agent Evaluation Engineer

Mindrift

Australia 🇦🇺 | Freelance - Mid level (2-4 years) | Closes Jun 14, 2026

Salary

CHECK DESCRIPTION

Job description

Role overview

Mindrift connects specialists with project-based AI opportunities for leading tech companies. This role focuses on testing, evaluating, and improving AI systems through task creation and dataset development.

Key details

Build datasets to assess AI coding agents' performance on real-world developer tasks
Design challenging tasks and evaluation criteria
Collaborate with AI teams to refine model capabilities
Project-based engagement (not permanent employment)

Responsibilities

Develop challenging tasks to evaluate AI coding agent performance
Design and implement evaluation criteria for real-world developer scenarios
Collaborate with AI engineering teams to improve model capabilities

Requirements

Proficiency in Python and understanding of AI/ML systems
Experience with software development processes and coding best practices
Strong analytical skills for evaluating technical performance metrics

Benefits

Project-based work with leading tech companies
Opportunities to influence AI development standards
Flexible engagement structure for specialists

Keywords

AI evaluation, machine learning, dataset creation, coding agents, Python, model testing, developer tasks, AI systems