torchrun
-
wandb Public Forked from wandb/wandb
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
-
tvm Public Forked from apache/tvm
Open Machine Learning Compiler Framework
-
rtp-llm Public Forked from alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
-
burn Public Forked from tracel-ai/burn
Burn is a next-generation tensor library and deep learning framework that doesn't compromise on flexibility, efficiency, or portability.
-
TransformerEngine Public Forked from NVIDIA/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory utilization in both training and inference.
-
iree Public Forked from iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.