SPEED AND SCALE, FROM PROTOTYPE TO PRODUCTION

GroqCloud

The AI inference platform built for developers. Fast responses, scalable performance, and costs you can plan for. Available in public, private, or co-cloud instances.

Built for speed and precision

Groq runs the models you care about.

Take advantage of fast AI inference performance, powered by our purpose-built LPU, for leading GenAI models across text, audio, and vision modalities.

  • Support for LLMs, STT, TTS, and image-to-text models
  • Optimized for popular models
  • Industry standard frameworks and integrations

Build now and scale as your needs grow

GroqCloud Plans

  • Free

    Great for anyone getting started with our APIs.

    • Build and Test on Groq
    • Community Support
    • Zero-data Retention Available
  • Developer

    Great for developers and startups that want to scale up and pay as they go.

    Everything on the Free Plan, plus:

    • Higher Token Limits
    • Chat Support
    • Flex Service Tier
    • Batch Processing
    • Spend Limits
    • Prompt Caching
  • Enterprise

    Great for businesses that require custom solutions for large-scale needs.

    Everything on the Developer Plan, plus:

    • Custom Models
    • Regional Endpoint Selection
    • Performance Tier
    • Scalable Capacity
    • Dedicated Support
    • LoRA Fine-Tunes

Consistent Performance, Predictable Spend

Lower latency means less compute time, with no batching required. Record-setting performance. Usage-based pricing.
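To make "usage-based" concrete, here is a minimal sketch of estimating spend from token counts and checking it against a budget. Everything in it is an assumption for illustration: the `TokenPrice` type, the `estimate_cost` and `within_spend_limit` helpers, and the prices themselves are hypothetical placeholders, not GroqCloud rates or APIs. Substitute the published per-token prices for the model you use.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical per-million-token prices in USD (placeholders, not real rates)."""
    input_per_million: Decimal
    output_per_million: Decimal


def estimate_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated spend in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (price.input_per_million * input_tokens
             + price.output_per_million * output_tokens)
    return total / Decimal(1_000_000)


def within_spend_limit(price: TokenPrice, input_tokens: int,
                       output_tokens: int, limit_usd: Decimal) -> bool:
    """Check whether the estimated spend stays at or under a budget."""
    return estimate_cost(price, input_tokens, output_tokens) <= limit_usd


# Placeholder prices chosen only to keep the arithmetic easy to follow.
EXAMPLE_PRICE = TokenPrice(Decimal("0.50"), Decimal("1.00"))

# A month with 200M input tokens and 50M output tokens at the placeholder prices.
monthly_estimate = estimate_cost(EXAMPLE_PRICE, 200_000_000, 50_000_000)
print(f"Estimated monthly spend: ${monthly_estimate}")
```

Using `Decimal` rather than floats keeps currency arithmetic exact, which matters when the estimate is compared against a hard budget.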

[Chart: survey responses to the question "What inference provider are you using or considering using to access models?" Source: Artificial Analysis AI Adoption Survey 2025]

Designed for inference. Not adapted for it.

Established in 2016 for inference, Groq is built differently. Its LPU is the only custom-built inference chip that fuels developers with the performance they need at a cost that doesn’t hold them back.

On-Prem Optionality

GroqRack

Available by request, the LPU powering GroqCloud can be deployed on-prem with GroqRack. Ideal for regulated industries or air-gapped environments. Seamless transition between cloud and local deployment.

Made to scale. Deployed globally.

Groq Data Center Deployments

Online now in four regions globally. Regional availability zones for minimal latency. Auto-scaling without overhead.

Secure by Default

Enterprise-grade data encryption. SOC 2, GDPR, HIPAA compliant. Optional private tenancy for sensitive workloads.

Fuel for Developers

Start Building For Free