Job description
Role overview
Mindrift connects specialists with project-based AI opportunities for leading tech companies. This role focuses on testing, evaluating, and improving AI systems through task creation and dataset development.
Key details
- Build datasets to assess AI coding agents' performance on real-world developer tasks
- Design challenging tasks and evaluation criteria
- Collaborate with AI teams to refine model capabilities
- Project-based engagement (not permanent employment)
Responsibilities
- Develop challenging tasks to evaluate AI coding agent performance
- Design and implement evaluation criteria for real-world developer scenarios
- Collaborate with AI engineering teams to improve model capabilities
Requirements
- Proficiency in Python and an understanding of AI/ML systems
- Experience with software development processes and coding best practices
- Strong analytical skills for evaluating technical performance metrics
Benefits
- Project-based work with leading tech companies
- Opportunities to influence AI development standards
- Flexible engagement structure for specialists
Keywords
AI evaluation, machine learning, dataset creation, coding agents, Python, model testing, developer tasks, AI systems