AI / Machine Learning | Remote

Freelance Agent Evaluation Engineer

Mindrift - Agency

London, United Kingdom | Freelance | Mid level (2-4 years) | Closes Jul 31, 2026

Salary

GBP 71,783 / year

Job description

Job details

  • Location: London, UK
  • Work mode: Remote
  • Employment type: Freelance (Not an internship)
  • Salary: GBP 71,783 per year

Role overview

Mindrift is seeking a Freelance Agent Evaluation Engineer to help build high-quality datasets for evaluating AI coding agents. In this role, you will focus on assessing how well AI models handle real-world developer tasks by creating complex scenarios and rigorous evaluation criteria. This is a project-based opportunity designed for specialists who want to contribute to the improvement of next-generation AI systems for leading tech companies.

Job details

This is a freelance, remote position based in London, UK. It is not an internship. Compensation for this project-based engagement is GBP 71,783 per year.

Responsibilities

  • Develop challenging real-world developer tasks to test AI coding agents
  • Define clear and objective evaluation criteria for model responses
  • Analyze AI-generated code for accuracy, efficiency, and security
  • Collaborate with AI researchers to refine dataset quality
  • Provide detailed feedback on model performance and failure points

Requirements

  • Proven experience in software development and coding best practices
  • Strong proficiency in English, both written and verbal
  • Ability to create complex technical test cases and benchmarks
  • Analytical mindset with a focus on edge-case detection
  • Experience with AI models or LLM evaluation is highly preferred

Benefits

  • Flexible remote work environment
  • Opportunity to work with leading global tech companies
  • Engagement in cutting-edge AI development projects

Keywords

AI Evaluation, LLM Testing, Coding Agents, Dataset Creation, Software Engineering, Prompt Engineering, Quality Assurance