Onlano
AI / Machine Learning | Onsite

Freelance Agent Evaluation Engineer

Mindrift - Company

New South Wales, Australia | Freelance - Mid level (2-4 years) | 0 applicants | Closes Jun 14, 2026

Salary

CHECK DESCRIPTION

Job description

Role overview

Mindrift connects specialists with project-based AI opportunities for leading tech companies, focusing on testing and improving AI systems. This freelance role involves building datasets to evaluate AI coding agents' performance on real-world developer tasks.

You'll design challenging technical tasks and create evaluation criteria to measure how effectively AI models handle software development challenges. The position is project-based and is not permanent employment.

Responsibilities

  • Design and implement evaluation datasets for AI coding agents
  • Create realistic software development tasks to test AI capabilities
  • Develop scoring criteria to measure AI model performance
  • Collaborate with AI teams to improve evaluation methodologies

Requirements

  • 2+ years of experience in AI/ML testing or software development
  • Strong understanding of coding workflows and software development practices
  • Proficiency in English (written and verbal communication)
  • Experience with AI evaluation frameworks or methodologies

Benefits

  • Project-based work with leading tech companies
  • Contribute to cutting-edge AI evaluation research
  • Flexible freelance arrangement

Keywords

AI evaluation, machine learning testing, coding agents, dataset creation, AI systems, developer tasks, model evaluation, English proficiency