Job description
Job details
- Location: UK
- Work mode: Remote
- Employment type: Full-time (Not an internship)
- Salary: GBP 77,112 per year
Role overview
Runware is seeking a Staff Software Engineer to lead the technical strategy for latency, throughput, and reliability within our AI inference platform. In this senior leadership position, you will optimize the entire pipeline from request ingress to GPU execution, ensuring high-performance delivery of AI models at scale.
Responsibilities
- Take full technical ownership of latency and throughput across the AI inference platform
- Architect and implement systems to achieve sub-one-second inference times in production
- Define technical standards and roadmaps for GPU execution and result delivery
- Optimize request ingress and data flow to maximize platform reliability and scale
- Lead the design of high-performance infrastructure to support massive AI workloads
Requirements
- Extensive experience in high-performance software engineering and system architecture
- Proven track record of optimizing GPU execution and inference latency at scale
- Deep understanding of distributed systems and low-latency networking
- Ability to lead complex technical initiatives from conceptual design to production
- Strong expertise in languages and tools used for high-performance AI infrastructure
Benefits
- Competitive salary of GBP 77,112 per year
- Remote-first work environment
- Opportunity to lead critical AI infrastructure
- High-impact role in a fast-growing AI company
Keywords
AI Inference, GPU Optimization, Latency Reduction, Distributed Systems, High-Performance Computing, Scalability, MLOps