Cerebras
Status: active
Ultra-fast LLM inference on custom wafer-scale chips. The free tier offers 30 RPM and 1M tokens/day, with no credit card required.
https://api.cerebras.ai/v1
Avg Latency: not reported
90-day Uptime: not reported
Rate Limits: 30 RPM / 14,400 RPD
Sign-up Required: Yes
Info
Base URL: https://api.cerebras.ai/v1
Sign-up Required: Yes
Credit Card: Not required
Context Window: 128K
Last Verified: 2026-03-30
Tags: fast-inference, openai-compatible, llama, function-call
Models (4)
llama3.1-8b (CTX: 128K · 30 RPM)
gpt-oss-120b (CTX: 128K · 30 RPM)
qwen-3-235b-a22b-instruct-2507 (CTX: 128K · 30 RPM)
zai-glm-4.7 (CTX: 128K · 10 RPM)
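The per-model RPM figures above translate into a minimum spacing between requests. The sketch below works through that arithmetic in shell. The helper names `min_interval_seconds` and `hours_until_daily_cap` are made up for illustration, and the second helper assumes the 14,400 RPD figure listed under Rate Limits applies to the traffic being paced, which the listing does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of any Cerebras tooling.

# Whole seconds to wait between requests so that no more than
# `rpm` requests go out per minute (rounded up to stay under the limit).
min_interval_seconds() {
  local rpm=$1
  echo $(( (60 + rpm - 1) / rpm ))
}

# Whole hours of continuous sending at `rpm` before a daily cap of
# `rpd` requests is used up (integer division, rounded down).
hours_until_daily_cap() {
  local rpm=$1
  local rpd=$2
  echo $(( rpd / (rpm * 60) ))
}

echo "30 RPM: wait $(min_interval_seconds 30)s between requests"
echo "10 RPM: wait $(min_interval_seconds 10)s between requests"
echo "30 RPM uses up 14400 RPD after $(hours_until_daily_cap 30 14400)h"
```

Under these assumptions, a 30 RPM model allows one request every 2 seconds, the 10 RPM model one every 6 seconds, and sending at a steady 30 RPM would reach 14,400 requests after 8 hours.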
Quick Start
export API_KEY="your_api_key_here"
curl https://api.cerebras.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "llama3.1-8b",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
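With a 30 RPM limit, a script that loops over the Quick Start request may occasionally be throttled. The following is a minimal sketch of retrying with exponential backoff. It assumes the API signals rate limiting with HTTP 429, which is the conventional status code but is not stated in this listing. The function names `backoff_delay` and `chat_with_retry` and the output file `response.json` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assumes API_KEY is already exported.

# Seconds to wait before retry number `attempt` (1-based): 2, 4, 8, ...
backoff_delay() {
  local attempt=$1
  echo $(( 1 << attempt ))
}

# Send the Quick Start request for `model`, retrying while the server
# answers 429 (assumed rate-limit status). Prints the final HTTP status
# and saves the response body to response.json.
# The model name is interpolated into JSON as-is, so it must not
# contain quotes or backslashes.
chat_with_retry() {
  local model=$1
  local max_attempts=${2:-4}
  local attempt=1
  local status
  while :; do
    status=$(curl -s -o response.json -w '%{http_code}' \
      https://api.cerebras.ai/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d "{\"model\": \"$model\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello! How are you?\"}]}")
    # Stop on any non-429 status, or when attempts are exhausted.
    if [ "$status" != "429" ] || [ "$attempt" -ge "$max_attempts" ]; then
      echo "$status"
      [ "$status" != "429" ]
      return
    fi
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
}
```

A call such as `chat_with_retry llama3.1-8b` would then make up to four attempts, pausing 2, 4, and 8 seconds between them, and return a non-zero exit code only if the last attempt was still throttled.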