Pricing

Together AI pricing context

Human-reviewed pricing summary paired with DevTune’s public AI search visibility benchmark.

Reviewed pricing summary

  • Together AI uses a pay-as-you-go model with three primary pricing tiers.
  • Serverless inference is charged per million tokens, with separate input and output rates varying by model; prices range from approximately $0.05 to $7.00 per million tokens.
  • Batch inference is priced at 50% of real-time API rates for most models, with support for up to 30B enqueued tokens.
  • Fine-tuning is billed per million tokens processed during training, varying by model size and method (LoRA vs. full fine-tuning, SFT vs. DPO).
  • GPU clusters are available on a pay-as-you-go hourly basis (approximately $3.49/hr for H100, $4.19/hr for H200, $7.49/hr for B200) or as reserved capacity with commitment discounts for periods over 6 days.
  • Dedicated model inference endpoints are billed per minute of usage.
  • Managed storage and sandbox environments carry additional fees.
  • Enterprise and AI Factory deployments require custom pricing via sales.
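Since nearly everything above bills per million tokens or per hour, the arithmetic can be sketched in a few lines. The rates used here are illustrative placeholders only, not quoted prices; the 50% batch discount mirrors the bullet above, and real per-model rates must come from the vendor's current price sheet.

```python
# Hedged sketch of pay-as-you-go cost arithmetic.
# All rates are hypothetical examples, not Together AI's actual prices.

def serverless_cost(input_tokens: int, output_tokens: int,
                    input_rate: float, output_rate: float) -> float:
    """Cost in USD; rates are USD per million tokens (input and output billed separately)."""
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

def batch_cost(input_tokens: int, output_tokens: int,
               input_rate: float, output_rate: float,
               discount: float = 0.5) -> float:
    """Batch inference priced as a fraction of real-time rates (50% for most models)."""
    return discount * serverless_cost(input_tokens, output_tokens,
                                      input_rate, output_rate)

def gpu_cluster_cost(hours: float, hourly_rate: float) -> float:
    """Pay-as-you-go GPU cluster cost at an hourly rate."""
    return hours * hourly_rate

# Example: 10M input + 2M output tokens at a hypothetical $0.88 per million each.
rt = serverless_cost(10_000_000, 2_000_000, 0.88, 0.88)
bt = batch_cost(10_000_000, 2_000_000, 0.88, 0.88)
print(f"real-time: ${rt:.2f}, batch: ${bt:.2f}")  # batch is half of real-time
```

The same per-unit pattern extends to the other line items (per-minute dedicated endpoints, hourly GPU rates); only the unit and rate change.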

Benchmark context

Together AI ranks #2 of 10 in the LLM Inference & Serverless GPU category, with 6.7% AI search visibility.

Sources and verification

Pricing changes often. Treat this page as evaluation context and verify contract terms, usage limits, and add-ons against the vendor’s current materials before making a buying decision.