API provider data is live · Hardware & cloud pricing curated 2026-02-23

Fireworks vs Novita

Fireworks lists 15 models and Novita lists 67; 13 are available on both.

Shared models

Model                      Fireworks $/1M out  Fireworks tok/s  Novita $/1M out  Novita tok/s
MiniMax-M2.1               -                   84               -                27
MiniMax-M2.5               -                   29               -                16
Qwen3-VL-30B-A3B-Instruct  -                   162              -                118
Qwen3-VL-30B-A3B-Thinking  -                   132              -                87
DeepSeek-V3.1              -                   79               -                36
DeepSeek-V3.2              -                   81               -                29
Llama-3.3-70B-Instruct     -                   109              -                27
Kimi-K2-Instruct-0905      -                   40               -                24
Kimi-K2-Thinking           -                   49               -                28
Kimi-K2.5                  -                   67               -                102
gpt-oss-120b               -                   90               -                51
gpt-oss-20b                -                   143              -                97
GLM-5                      -                   70               -                39
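
To make the throughput gap easier to read, the tok/s figures from the table above can be turned into per-model ratios. The sketch below is only an illustration: the `THROUGHPUT` dictionary and the helper names `speed_ratio` and `faster_provider` are hypothetical and not part of vram.run or its CLI, and the numbers are a snapshot copied from this page, so they will drift as the live data changes.

```python
# Throughput figures (tok/s) copied from the shared-models table on this page.
# Tuple order is (Fireworks, Novita). These are a snapshot, not live data.
THROUGHPUT = {
    "MiniMax-M2.1": (84, 27),
    "MiniMax-M2.5": (29, 16),
    "Qwen3-VL-30B-A3B-Instruct": (162, 118),
    "Qwen3-VL-30B-A3B-Thinking": (132, 87),
    "DeepSeek-V3.1": (79, 36),
    "DeepSeek-V3.2": (81, 29),
    "Llama-3.3-70B-Instruct": (109, 27),
    "Kimi-K2-Instruct-0905": (40, 24),
    "Kimi-K2-Thinking": (49, 28),
    "Kimi-K2.5": (67, 102),
    "gpt-oss-120b": (90, 51),
    "gpt-oss-20b": (143, 97),
    "GLM-5": (70, 39),
}


def speed_ratio(fireworks: float, novita: float) -> float:
    """Return Fireworks throughput divided by Novita throughput."""
    return fireworks / novita


def faster_provider(fireworks: float, novita: float) -> str:
    """Name the provider with the higher tok/s, or 'tie' if they match."""
    if fireworks > novita:
        return "Fireworks"
    if novita > fireworks:
        return "Novita"
    return "tie"


# Count how many shared models each provider serves faster.
fireworks_wins = sum(
    1 for fw, nv in THROUGHPUT.values() if faster_provider(fw, nv) == "Fireworks"
)
novita_wins = sum(
    1 for fw, nv in THROUGHPUT.values() if faster_provider(fw, nv) == "Novita"
)

if __name__ == "__main__":
    for model, (fw, nv) in THROUGHPUT.items():
        ratio = speed_ratio(fw, nv)
        print(f"{model:<27} {faster_provider(fw, nv):<10} {ratio:.2f}x")
    print(f"Fireworks faster on {fireworks_wins}, Novita faster on {novita_wins}")
```

A ratio above 1 means Fireworks is faster for that model and a ratio below 1 means Novita is faster. Throughput alone does not settle which provider is the better choice, since the price columns also matter.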
MIT · v0.6.0