Onlano
AI / Machine Learning | Remote

Freelance Agent Evaluation Engineer

Mindrift - Company

Ontario, Canada | Freelance - Mid level (2-4 years) | 0 applicants | Closes Jun 20, 2026

Job description

Job details

  • Location: Ontario, Canada
  • Work mode: Remote
  • Employment type: Freelance (Not an internship)
  • Salary: Salary details are available in the employer description.

Role overview

Mindrift is seeking a Freelance Agent Evaluation Engineer in Ontario, Canada to work on project-based AI opportunities for leading tech companies. This remote freelance role focuses on testing, evaluating, and improving AI coding agents by creating challenging developer tasks and evaluation criteria. The position involves building datasets to assess how well AI models handle real-world development scenarios.

Job details

  • Location: Ontario, Canada (Remote)
  • Type: Freelance, project-based engagement
  • Salary: Details available in employer description

This is a flexible, project-based opportunity rather than permanent employment. Candidates must submit their CV in English and indicate their English proficiency level. The role requires strong technical skills in software development and AI system evaluation.

Responsibilities

  • Create challenging tasks to evaluate AI coding agent performance
  • Develop evaluation criteria for real-world developer scenarios
  • Build and maintain datasets for AI model assessment
  • Test and analyze AI system capabilities and limitations
  • Document evaluation methodologies and findings

Requirements

  • Strong software development background with coding expertise
  • Experience with AI systems, machine learning, or model evaluation
  • Fluency in English (written and spoken)
  • Ability to design realistic developer task scenarios
  • Self-directed work style for project-based engagement

Benefits

  • Flexible project-based schedule
  • Remote work from Ontario
  • Exposure to cutting-edge AI technology
  • Work with leading tech companies

Keywords

AI evaluation, machine learning, coding agents, dataset creation, software testing, Python, model evaluation, developer tools