Llama-4-Maverick-17B-128E-Instruct
by meta-llama
402B params · image-text-to-text · 468 likes · 6.5k downloads
Llama-4-Maverick-17B-128E-Instruct is a 402B-parameter model. At Q4 quantization its weights occupy roughly 201GB, so it requires a GPU (or multi-GPU setup) with at least 201GB of VRAM.
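The 201GB figure follows from simple arithmetic: Q4 quantization stores about 4 bits (0.5 bytes) per parameter, so 402B parameters come to roughly 201GB of weights. A minimal sketch of that estimate (the function name is illustrative, and KV cache and activation overhead are ignored):

```python
def weight_vram_gb(params_billions: float, bits_per_param: float) -> float:
    """Back-of-envelope VRAM needed for model weights alone, in GB.

    Assumes bits_per_param bits per weight (e.g. 4 for Q4) and ignores
    KV cache, activations, and framework overhead.
    """
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal gigabytes

print(weight_vram_gb(402, 4))  # 201.0 GB at Q4
```

Real deployments need headroom beyond this for the KV cache, which grows with context length and batch size.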
Inference providers
| Provider | $/1M in | $/1M out | Throughput |
|---|---|---|---|
| SambaNova | n/a | n/a | 340 tok/s |