Onlano
AI / Machine Learning | Onsite

Freelance Agent Evaluation Engineer

Mindrift - Company

Australia | Freelance - Mid level (2-4 years) | 0 applicants | Closes Jun 14, 2026

Salary

CHECK DESCRIPTION

Job description

Role overview

Mindrift connects specialists with project-based AI opportunities for leading tech companies. This role focuses on testing, evaluating, and improving AI systems through task creation and dataset development.

Key details

  • Build datasets to assess AI coding agents' performance on real-world developer tasks
  • Design challenging tasks and evaluation criteria
  • Collaborate with AI teams to refine model capabilities
  • Project-based engagement (not permanent employment)

Responsibilities

  • Develop challenging tasks to evaluate AI coding agent performance
  • Design and implement evaluation criteria for real-world developer scenarios
  • Collaborate with AI engineering teams to improve model capabilities

Requirements

  • Proficiency in Python and understanding of AI/ML systems
  • Experience with software development processes and coding best practices
  • Strong analytical skills for evaluating technical performance metrics

Benefits

  • Project-based work with leading tech companies
  • Opportunities to influence AI development standards
  • Flexible engagement structure for specialists

Keywords

AI evaluation, machine learning, dataset creation, coding agents, Python, model testing, developer tasks, AI systems