Llama 3.1 8B

Cerebras

Meta's Llama 3.1 8B running on Cerebras wafer-scale silicon for ultra-fast inference speeds

Pricing

per 1M tokens

Input / 1M$0.10
Output / 1M$0.10

Specifications

API Model IDllama3.1-8b
Context Window128K tokens
Max Output8K tokens

Modalities

text

Capabilities

tool-usestreamingjson-modecode

Other Cerebras Text / Chat Models

ModelInput / 1MOutput / 1MCache Read / 1MCache Write / 1M
GPT OSS 120B$0.35$0.75