AarushSah - Overview

Aarush Sah AarushSah

Meta Superintelligence Labs

Achievements

Pinned

groq/openbench Public

Provider-agnostic, open-source evaluation infrastructure for language models

Python, 782 stars, 101 forks

Set_Eval Public

Novel benchmark for probing the visual reasoning capabilities of large language models

Python, 4 stars

eris-eval Public

LLM evaluation framework that assesses model performance through simulated debates

Python, 2 stars