vram.run
API provider data is live · Hardware & cloud pricing curated 2026-02-23

Novita vs Together AI

Novita lists 67 models and Together AI lists 29; 17 are available on both.

Shared models

Model                                   Novita $/1M out  Novita tok/s  Together AI $/1M out  Together AI tok/s
Qwen3-235B-A22B-Instruct-2507           -                34            -                     23
Qwen3-Coder-480B-A35B-Instruct          -                64            -                     58
Qwen3-Next-80B-A3B-Instruct             -                101           -                     136
Qwen3-VL-8B-Instruct                    -                67            -                     63
Qwen3.5-397B-A17B                       -                47            -                     18
DeepSeek-R1                             -                40            -                     68
DeepSeek-R1-0528                        -                38            -                     84
DeepSeek-V3                             -                33            -                     42
DeepSeek-V3-0324                        -                32            -                     31
DeepSeek-V3.1                           -                36            -                     40
Llama-3.3-70B-Instruct                  -                27            -                     108
Llama-4-Maverick-17B-128E-Instruct-FP8  -                89            -                     48
Kimi-K2.5                               -                102           -                     64
gpt-oss-120b                            -                51            -                     81
gpt-oss-20b                             -                97            -                     76
GLM-4.6                                 -                100           -                     42
GLM-5                                   -                39            -                     42
MIT · v0.6.0