Llama 3.1 8B Instant

Groq

Meta's Llama 3.1 8B running on Groq's LPU hardware for ultra-fast inference at minimal cost

Pricing

per 1M tokens

Input / 1M$0.05
Output / 1M$0.08

Specifications

API Model IDllama-3.1-8b-instant
Context Window128K tokens
Max Output8K tokens

Modalities

text

Capabilities

tool-usestreamingjson-modecode

Other Groq Text / Chat Models

ModelInput / 1MOutput / 1MCache Read / 1MCache Write / 1M
GPT OSS 20B$0.07$0.30
Llama 4 Scout$0.11$0.34
GPT OSS 120B$0.15$0.60
Llama 4 Maverick$0.20$0.60
Qwen3 32B$0.29$0.59
Llama 3.3 70B Versatile$0.59$0.79
Kimi K2$1.00$3.00