Job description
Job details
- Location: Glasgow, Scotland
- Work mode: Remote
- Employment type: Freelance (Not an internship)
- Salary: GBP 66,355 per year
Role overview
Mindrift is seeking a Freelance Agent Evaluation Engineer to help build high-quality datasets for evaluating AI coding agents. In this role, you will design challenging real-world developer tasks and establish rigorous evaluation criteria to measure how effectively AI models handle complex software engineering problems. This is a project-based opportunity ideal for specialists looking to influence the next generation of AI tools.
Responsibilities
- Create complex, real-world developer tasks to test AI coding agent capabilities.
- Develop detailed evaluation criteria and benchmarks for model performance.
- Analyze AI-generated code for accuracy, efficiency, and security.
- Collaborate with tech teams to refine dataset quality and diversity.
- Provide expert feedback on model behavior to improve AI reasoning.
Requirements
- Strong proficiency in software development and multiple programming languages.
- Experience in testing, evaluating, or fine-tuning AI systems or LLMs.
- Ability to design edge-case scenarios that challenge AI coding logic.
- Professional-level English proficiency for documentation and reporting.
- Proven track record of delivering high-quality technical work independently.
Benefits
- Flexible project-based work schedule.
- Opportunity to work with leading global tech companies.
- Contribution to cutting-edge AI development.
- Competitive freelance compensation.
Keywords
AI Evaluation, LLM, Coding Agents, Dataset Creation, Software Engineering, Prompt Engineering, Quality Assurance