AI / Machine Learning
Remote

Freelance Agent Evaluation Engineer

Mindrift - Agency

London, UK, 🇬🇧 United Kingdom
Freelance - Mid level (2-4 years)
Closes Jul 31, 2026

Salary

GBP 71,783 / year

Job description

Job details

Location: London, UK
Work mode: Remote
Employment type: Freelance (Not an internship)
Salary: GBP 71,783 per year

Role overview

Mindrift is seeking a Freelance Agent Evaluation Engineer to help build high-quality datasets for evaluating AI coding agents. In this role, you will focus on assessing how well AI models handle real-world developer tasks by creating complex scenarios and rigorous evaluation criteria. This is a project-based opportunity designed for specialists who want to contribute to the improvement of next-generation AI systems for leading tech companies.

Responsibilities

Develop challenging real-world developer tasks to test AI coding agents
Define clear and objective evaluation criteria for model responses
Analyze AI-generated code for accuracy, efficiency, and security
Collaborate with AI researchers to refine dataset quality
Provide detailed feedback on model performance and failure points

Requirements

Proven experience in software development and coding best practices
Strong proficiency in English, both written and verbal
Ability to create complex technical test cases and benchmarks
Analytical mindset with a focus on edge-case detection
Experience with AI models or LLM evaluation is highly preferred

Benefits

Flexible remote work environment
Opportunity to work with leading global tech companies
Engagement in cutting-edge AI development projects

Keywords

AI Evaluation, LLM Testing, Coding Agents, Dataset Creation, Software Engineering, Prompt Engineering, Quality Assurance