Pricing

Lepton AI pricing context

Human-reviewed pricing summary paired with DevTune’s public AI search visibility benchmark.

Reviewed pricing summary

Before its acquisition by NVIDIA, Lepton AI offered consumption-based, per-token pricing for serverless LLM inference and hourly GPU rental rates for dedicated instances.
Third-party benchmark analysis placed Lepton AI's blended per-token cost for Llama 3.1 70B at approximately $0.80 per 1 million tokens, comparable to Together AI ($0.88) and Fireworks AI ($0.90).
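To make the per-token comparison concrete, the sketch below turns the quoted blended rates into an estimated bill for a given token volume. The rates are the ones cited above. The 500 million token monthly volume and the function names are hypothetical, chosen only for illustration, and the result ignores any input/output price split, volume discounts, or minimums.

```python
# Blended prices in USD per 1 million tokens for Llama 3.1 70B,
# as quoted in the third-party benchmark figures above.
PRICE_PER_MILLION_USD = {
    "Lepton AI": 0.80,
    "Together AI": 0.88,
    "Fireworks AI": 0.90,
}


def estimate_cost_usd(tokens: int, price_per_million: float) -> float:
    """Return the estimated cost in USD for a number of tokens."""
    return tokens / 1_000_000 * price_per_million


def compare_costs(tokens: int) -> dict[str, float]:
    """Return the estimated cost per provider, rounded to cents."""
    return {
        provider: round(estimate_cost_usd(tokens, price), 2)
        for provider, price in PRICE_PER_MILLION_USD.items()
    }


if __name__ == "__main__":
    monthly_tokens = 500_000_000  # hypothetical workload, not from the source
    for provider, cost in compare_costs(monthly_tokens).items():
        print(f"{provider}: ${cost:,.2f}")
```

At these rates the gap between providers is about 10 to 12 percent, so the absolute difference only becomes material at high token volumes.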
Hourly rates for its GPU-backed dedicated instances were described as competitive, though no specific figures are cited here.
Since the acquisition, pricing is set through NVIDIA DGX Cloud Lepton partner marketplaces (such as CoreWeave, Lambda, and Nebius) and is not centrally published. Verify current pricing directly with NVIDIA or the individual cloud partner.


Benchmark context

of 10 in LLM Inference & Serverless GPU

AI search visibility: 0.0%

Sources and verification

Pricing changes often. Treat this page as evaluation context and verify contract terms, usage limits, and add-ons against the vendor’s current materials before making a buying decision.

github.com, nvidianews.nvidia.com, developer.nvidia.com, techcrunch.com, technode.com, datacenterdynamics.com