Pricing
Replicate pricing context
Human-reviewed pricing summary paired with DevTune’s public AI search visibility benchmark.
Reviewed pricing summary
- Replicate uses pure pay-as-you-go billing with no free tier.
- Public models are billed by the second based on GPU hardware: Nvidia T4 at $0.000225/sec ($0.81/hr), L40S at $0.000975/sec ($3.51/hr), A100 80GB at $0.001400/sec ($5.04/hr), and H100 at $0.001525/sec ($5.49/hr).
- Multi-GPU configurations up to 8×H100 are available via committed-spend contracts.
- Some models use per-output pricing (e.g., FLUX Schnell at $3.00 per 1,000 images, which is $0.003/image; FLUX Dev at $0.025/image).
- Language models use per-token rates (e.g., DeepSeek-R1 at $3.75/million input tokens).
- Private custom models run on dedicated hardware and accrue idle-time charges.
- Enterprise plans add a dedicated account manager, priority support, higher GPU limits, performance SLAs, and volume discounts.
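The rates in the list above can be turned into rough cost estimates with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, nothing here calls Replicate's API, and the figures are copied from this summary, so they may no longer match the vendor's current price list.

```python
# Per-second GPU rates in USD, copied from the summary above.
# Assumption: these may be outdated; check the vendor's current pricing.
RATES_PER_SEC = {
    "T4": 0.000225,
    "L40S": 0.000975,
    "A100 80GB": 0.001400,
    "H100": 0.001525,
}


def hourly_rate(gpu: str) -> float:
    """Convert a per-second rate to an hourly rate, rounded to cents."""
    return round(RATES_PER_SEC[gpu] * 3600, 2)


def run_cost(gpu: str, seconds: float) -> float:
    """Estimate the cost of a run billed by the second on the given GPU."""
    return RATES_PER_SEC[gpu] * seconds


def per_image_cost(price_per_thousand: float) -> float:
    """Convert a per-1,000-images price to a per-image price."""
    return price_per_thousand / 1000


def token_cost(tokens: int, price_per_million: float) -> float:
    """Estimate the cost of a token count at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    for gpu in RATES_PER_SEC:
        print(f"{gpu}: ${hourly_rate(gpu):.2f}/hr")
    print(f"90 s on A100 80GB: ${run_cost('A100 80GB', 90):.4f}")
    print(f"FLUX Schnell per image: ${per_image_cost(3.00):.4f}")
    print(f"200k input tokens at $3.75/M: ${token_cost(200_000, 3.75):.2f}")
```

This estimate covers active run time only. Idle-time charges on private models and any negotiated volume discounts are not modeled.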
Benchmark context
Ranked #10 of 10 in LLM Inference & Serverless GPU, with 0.0% AI search visibility.
Sources and verification
Pricing changes often. Treat this page as evaluation context and verify contract terms, usage limits, and add-ons against the vendor’s current materials before making a buying decision.