openbench

Welcome to openbench!

openbench is an open-source framework for standardized, reproducible benchmarking of large language models (LLMs). Our goal is to make evaluation both rigorous and accessible:

Run industry-standard benchmarks easily on any model, wherever it’s hosted.
Design and run evaluations tailored to your specific needs.
Choose from 30+ evaluation suites spanning knowledge, reasoning, coding, mathematics, and more.

With openbench, you can build trust in model performance through transparent, reproducible, and domain-relevant evaluation.

Key Features

Works with Any Model Provider

openbench supports 15+ model providers out of the box.

Stay Updated

We are iterating rapidly! Sign up below to receive updates about the latest openbench features.