Job description
Job details
- Location: UK
- Work mode: Remote
- Employment type: Freelance (Not an internship)
- Salary: GBP 69,357 per year
Role overview
Mindrift is seeking a Freelance Agent Evaluation Engineer to help build a high-quality dataset for evaluating AI coding agents. In this role, you will focus on testing how AI models handle complex, real-world developer tasks to improve the overall performance of next-generation AI systems.
The work is project-based: you will create challenging tasks and rigorous evaluation criteria for AI coding agents.
Responsibilities
- Design and create challenging real-world developer tasks to test AI coding agents.
- Develop comprehensive evaluation criteria to measure model accuracy and efficiency.
- Analyze AI-generated code to identify edge cases and failure points.
- Collaborate on the creation of high-quality datasets for AI training and testing.
- Provide detailed feedback on AI model performance to drive system improvements.
Requirements
- Proven experience in software development and coding best practices.
- Strong proficiency in English, both written and verbal.
- Ability to break down complex technical problems into testable tasks.
- Analytical mindset with a focus on quality and precision.
- Experience with AI tools or LLM evaluation is highly preferred.
Benefits
- Flexible remote work environment.
- Opportunity to work with leading tech companies.
- Exposure to cutting-edge AI development and evaluation.
- Competitive project-based compensation.
Keywords
AI Evaluation, LLM Testing, Coding Agents, Software Engineering, Dataset Creation, Prompt Engineering, Quality Assurance