tenstorrent / tt-metal
- Sunday, April 6, 2025 at 00:00:02
🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.
Last Update: March 24, 2025
Notes:
- ttft = time to first token | t/s/u = tokens/second/user | t/s = tokens/second; where t/s = t/s/u * batch.
- TP = Tensor Parallel, DP = Data Parallel; these define the parallelization factors across multiple devices.
- The reported LLM performance is for an input sequence length (number of rows filled in the KV cache) of 128 for all models except Mamba (which can accept any sequence length).
- The reported t/s/u is the throughput of the first token generated after prefill, i.e. 1 / inter-token latency.
Model | Batch | Hardware | ttft (ms) | t/s/u | Target t/s/u | t/s | TT-Metalium Release |
---|---|---|---|---|---|---|---|
Whisper (distil-large-v3) | 1 | n150 | 252 | 43.4 | 45 | 43.4 | v0.57.0-rc33 |
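
The two relations in the notes above (t/s = t/s/u * batch, and t/s/u = 1 / inter-token latency) can be checked against the Whisper row with a few lines of arithmetic. This is only an illustrative sketch: the function names are hypothetical and not part of TT-NN or TT-Metalium, and the latency is simply derived from the reported t/s/u rather than measured.

```python
# Illustrative helpers only; these names are hypothetical, not a tt-metal API.

def aggregate_throughput(tokens_per_sec_per_user: float, batch: int) -> float:
    """Total tokens/second across the batch: t/s = t/s/u * batch."""
    return tokens_per_sec_per_user * batch


def inter_token_latency_ms(tokens_per_sec_per_user: float) -> float:
    """Invert t/s/u = 1 / inter-token latency and convert seconds to ms."""
    return 1000.0 / tokens_per_sec_per_user


# Values copied from the Whisper (distil-large-v3) row: batch 1, 43.4 t/s/u.
whisper_tsu = 43.4
whisper_batch = 1

whisper_ts = aggregate_throughput(whisper_tsu, whisper_batch)
whisper_itl_ms = inter_token_latency_ms(whisper_tsu)

print(f"t/s = {whisper_ts:.1f}")
print(f"inter-token latency = {whisper_itl_ms:.2f} ms")
```

With a batch of 1, t/s equals t/s/u, which matches the 43.4 in both columns of the row, and the implied inter-token latency is roughly 23 ms.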

Model | Batch | Hardware | fps | Target fps | Release |
---|---|---|---|---|---|
ResNet-50 (224x224) | 16 | n150 | 4,700 | 7,000 | |
ResNet-50 (224x224) (DP=2) | 32 | n300 | 9,200 | 14,000 | |
ResNet-50 (224x224) (DP=8) | 128 | QuietBox | 35,800 | 56,000 | |
ResNet-50 (224x224) (DP=32) | 512 | Galaxy | 96,800 | 224,000 | |
ViT (224x224) | 8 | n150 | 1,100 | 1,600 | |
Stable Diffusion 1.4 (512x512) | 1 | n150 | 0.167 | 0.3 | |
YOLOv4 (320x320) | 1 | n150 | 120 | 300 | |
YOLOv4 (640x640) | 1 | n150 | 50 | 100 | |
SegFormer Semantic Segmentation (512x512) | 1 | n150 | 90 | 300 | |
Stable Diffusion 3.5 medium (512x512) | 1 | n150 | 0.06 | 0.3 | |
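
As a reading aid for the ResNet-50 rows, the sketch below derives the per-replica batch (batch divided by the DP factor) and the fraction of the target fps that each configuration reaches. The numbers are copied from the table; treating the unlabeled n150 row as DP=1 is an assumption, and the helper names are hypothetical rather than anything provided by tt-metal.

```python
# ResNet-50 (224x224) rows from the table: (hardware, dp, batch, fps, target_fps).
# Assumption: the n150 row carries no DP label, so it is treated as DP=1.
resnet50_rows = [
    ("n150", 1, 16, 4_700, 7_000),
    ("n300", 2, 32, 9_200, 14_000),
    ("QuietBox", 8, 128, 35_800, 56_000),
    ("Galaxy", 32, 512, 96_800, 224_000),
]


def per_replica_batch(batch: int, dp: int) -> int:
    """Batch handled by each data-parallel replica."""
    return batch // dp


def fraction_of_target(fps: float, target_fps: float) -> float:
    """Measured fps as a fraction of the target fps."""
    return fps / target_fps


summary = [
    (hw, per_replica_batch(batch, dp), fraction_of_target(fps, target))
    for hw, dp, batch, fps, target in resnet50_rows
]

for hw, replica_batch, frac in summary:
    print(f"{hw}: per-replica batch = {replica_batch}, {frac:.1%} of target")
```

Every row works out to a per-replica batch of 16, so the Batch column scales directly with the DP factor.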

Model | Batch | Hardware | sen/sec | Target sen/sec | Release |
---|---|---|---|---|---|
BERT-Large | 8 | n150 | 270 | 400 | |
For the latest model updates and features, please see MODEL_UPDATES.md.
For information on initial model procedures, please see Model Bring-Up and Testing.
TT-Metalium is our low-level programming model, enabling kernel development for Tenstorrent hardware.
Get started with simple kernels.
This repo is part of Tenstorrent’s bounty program. If you are interested in helping to improve tt-metal, please read the Tenstorrent Bounty Program Terms and Conditions before heading to the issues tab. Look for issues tagged with both “bounty” and a difficulty level!