Pricing

Fireworks AI pricing context

Human-reviewed pricing summary paired with DevTune’s public AI search visibility benchmark.

Reviewed pricing summary

  • Fireworks AI uses a usage-based, pay-as-you-go model with no required subscription.
  • Serverless inference starts at $0.10/1M tokens for models under 4B parameters, $0.20/1M for 4B–16B, $0.90/1M for models over 16B, and model-specific rates for frontier models (e.g., DeepSeek V3 family at $0.56 input/$1.68 output per 1M tokens).
  • Batch inference is priced at 50% of serverless rates; cached input tokens are billed at 50% of the standard rate.
  • On-demand GPU deployments are billed per second at hourly-equivalent rates: H100 and H200 at $7/hr, B200 at $10/hr, and B300 at $12/hr.
  • Fine-tuning via LoRA SFT starts at $0.50/1M training tokens for models up to 16B parameters; full-parameter SFT from $1.00/1M.
  • Reinforcement fine-tuning is billed at the same per-GPU-second rate as on-demand deployment.
  • New accounts receive $1 in free starter credits.
  • Enterprise pricing is available via direct contract.
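The tiered rates and discounts above can be turned into a rough cost estimate. The sketch below is illustrative only: the rates and the 50% batch/cache discounts are taken from this summary, while the function names, tier labels, and structure are assumptions, not a vendor API.

```python
# Illustrative cost estimator using the rates quoted in the summary above.
# Tier names and function names are hypothetical; verify current rates with the vendor.

SERVERLESS_RATES = {  # USD per 1M tokens, by model size tier
    "under_4b": 0.10,
    "4b_to_16b": 0.20,
    "over_16b": 0.90,
}

def serverless_cost(tokens: int, tier: str, *, batch: bool = False,
                    cached_fraction: float = 0.0) -> float:
    """Estimate serverless inference cost in USD.

    batch: batch inference is priced at 50% of serverless rates.
    cached_fraction: share of tokens served from cache, billed at 50%.
    """
    rate = SERVERLESS_RATES[tier] / 1_000_000  # USD per token
    cached = tokens * cached_fraction
    uncached = tokens - cached
    cost = uncached * rate + cached * rate * 0.5
    if batch:
        cost *= 0.5
    return cost

def on_demand_cost(seconds: float, hourly_rate: float) -> float:
    """Per-second GPU billing at an hourly-equivalent rate."""
    return seconds * hourly_rate / 3600

# 10M tokens on a 7B-class model, no caching: 10 x $0.20 = $2.00
print(round(serverless_cost(10_000_000, "4b_to_16b"), 2))   # 2.0
# 90 minutes on an H100 at $7/hr: 1.5 x $7 = $10.50
print(round(on_demand_cost(90 * 60, 7.0), 2))               # 10.5
```

For example, running the same 10M-token job as a batch workload would halve the estimate to $1.00, per the 50% batch discount noted above.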

Benchmark context

Ranked #8 of 10 in LLM Inference & Serverless GPU

AI search visibility: 0.0%

Sources and verification

Pricing changes often. Treat this page as evaluation context and verify contract terms, usage limits, and add-ons against the vendor’s current materials before making a buying decision.