vram.run

API provider data is live · Hardware & cloud pricing curated 2026-02-23

Featherless vs Together AI

Featherless lists 207 models, Together AI lists 34; 16 are shared.

Shared models

Model	Featherless $/1M out	Featherless tok/s	Together AI $/1M out	Together AI tok/s
MiniMax-M2.7				155 tok/s
MiniMax-M3				19 tok/s
Qwen2.5-7B-Instruct				123 tok/s
Qwen3.5-397B-A17B				102 tok/s
Qwen3.5-9B				103 tok/s
DeepSeek-V4-Pro				57 tok/s
gemma-4-31B-it				72 tok/s
Llama-3.3-70B-Instruct				37 tok/s
Meta-Llama-3-8B-Instruct				106 tok/s
Kimi-K2.6				176 tok/s
Kimi-K2.7-Code				54 tok/s
gpt-oss-120b				107 tok/s
gpt-oss-20b				143 tok/s
GLM-5				107 tok/s
GLM-5.1				54 tok/s
GLM-5.2				111 tok/s

MIT · v0.6.0