Baseten
Welcome to Baseten
Baseten is an AI infrastructure platform. We combine applied performance research, distributed multi-cloud infrastructure, and developer tooling to run models of all modalities in production.
Get started:
- Deploy an open-source model in two clicks from the model library.
- Read our docs to package and serve a fine-tuned or custom model.
Repositories
- truss (Public): The simplest way to serve AI/ML models in production
- ml-cookbook (Public): Ready-to-use ML training recipes to help you build and deploy models on Baseten.
- Model-Optimizer (Public, forked from NVIDIA/Model-Optimizer): A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM, TensorRT, vLLM, etc. to optimize inference speed.
- genai-bench (Public, forked from sgl-project/genai-bench): Genai-bench is a powerful benchmark tool designed for comprehensive token-level performance evaluation of large language model (LLM) serving systems.