Research Scientist — Benchmarks

Research | San Francisco / Remote | Full-time

Design rigorous benchmarks that expose real capability gaps in frontier models. Collaborate with domain experts to translate professional workflows into measurable evaluation criteria used industry-wide.

Evaluation, Statistics, NLP

Apply

Send us your details and we'll get back to you if there's a fit.
