We believe in building tools developers love – and we see the next frontier in AI-powered developer tools. The DPAI Arena project aims to define and maintain an open, community-driven benchmark for evaluating AI agents and IDE-embedded AI features at scale. We believe that the benchmark industry is only in its early stages, and we’re planning to take part in the evolution of the field.
We’re looking for a Senior Product Manager to own the direction of DPAI Arena and help define what “good evaluation” looks like and how we measure it.
What you’ll do
- Refine our answer to the question of what makes an evaluation task a representative data point for software engineers and IDE AI features, using usage data, feedback loops, and community input.
- Define and iterate on the evaluation methodology: designing tasks, specifying metrics and success criteria, and covering real developer workflows from both product and UX perspectives.
- Build and maintain relationships with external vendors, platform partners, outsourcing providers, and internal stakeholders to expand participation in DPAI Arena and drive both business and ecosystem development.
- Curate and manage datasets, coordinating contributions from JetBrains teams, external partners, and outsourcing vendors (including crowdsourced workers when needed) to keep evaluation tasks high-quality, representative, and diverse.
- Own the evaluation standard and roadmap, representing DPAI Arena internally and externally while aligning plans with the broader JetBrains AI and IDE strategy.
What we expect from you
- Proven experience in product management (developer tools or data-driven products is a plus).
- Experience working on GenAI features for developer tooling.
- Strong analytical skills, with the ability to define meaningful metrics, interpret results, and turn them into product design decisions.
- Experience working with multiple stakeholders: internal teams, external partners, and vendors (including outsourcing providers).
- Understanding of developer workflows, IDEs, and AI-assisted development, and the ability to translate real-world coding tasks into evaluation workloads.
- Strong communication skills, both written and spoken.
- Ownership mindset and proactive approach.
- Engineering background (particularly SWE or QA for tech products).
Bonus points
- Experience maintaining or curating open-source products.
- ML experience.
- Familiarity with outsourcing or crowdsourcing platforms, data collection pipelines, or community-driven ecosystems.
We process the data provided in your job application in accordance with the Recruitment Privacy Policy.