openbench
Welcome to openbench!
openbench is an open-source framework for standardized, reproducible benchmarking of large language models (LLMs). Our goal is to make evaluation both rigorous and accessible:
- Run industry-standard benchmarks easily on any model, wherever it’s hosted.
- Design and run evaluations tailored to your specific needs.
- Choose from 30+ evaluation suites spanning knowledge, reasoning, coding, mathematics, and more.
With openbench, you can build trust in model performance through transparent, reproducible, and domain-relevant evaluation.
Quick Start
Key Features
Works with Any Model Provider
openbench supports 15+ model providers out of the box.
Stay Updated
We are rapidly iterating! Sign up below to receive updates about the latest openbench features.