GitHub Models
Status: active
Free model inference with a GitHub personal access token (PAT): access GPT-5, GPT-4.1, o3/o4, Llama 4, DeepSeek R1, Grok 3, Mistral and more. Low tier: 15 requests per minute (RPM) / 150 requests per day (RPD); high tier: 10 RPM / 50 RPD; reasoning models have per-model custom quotas. No credit card required.
https://models.github.ai/inference
Rate Limits
15 RPM / 150 RPD (low tier)
Sign-up Required
Yes
Models (32)
openai/gpt-4.1 · CTX: 1M · 10 RPM
openai/gpt-4.1-mini · CTX: 1M · 15 RPM
openai/gpt-4.1-nano · CTX: 1M · 15 RPM
openai/gpt-4o · CTX: 128K · 10 RPM
openai/gpt-4o-mini · CTX: 128K · 15 RPM
openai/gpt-5 · CTX: 200K · 1 RPM
openai/gpt-5-chat · CTX: 200K · 1 RPM
openai/gpt-5-mini · CTX: 200K · 2 RPM
openai/gpt-5-nano · CTX: 200K · 4 RPM
openai/o4-mini · CTX: 200K · 2 RPM
openai/o3 · CTX: 200K · 1 RPM
openai/o3-mini · CTX: 200K · 3 RPM
openai/o1 · CTX: 200K · 1 RPM
meta/llama-4-scout-17b-16e-instruct · CTX: 10M · 10 RPM
meta/llama-4-maverick-17b-128e-instruct-fp8 · CTX: 1M · 10 RPM
meta/llama-3.3-70b-instruct · CTX: 128K · 10 RPM
meta/llama-3.2-90b-vision-instruct · CTX: 128K · 10 RPM
deepseek/deepseek-r1-0528 · CTX: 128K · 2 RPM
deepseek/deepseek-r1 · CTX: 128K · 2 RPM
deepseek/deepseek-v3-0324 · CTX: 128K · 10 RPM
xai/grok-3 · CTX: 128K · 2 RPM
xai/grok-3-mini · CTX: 128K · 3 RPM
mistral-ai/mistral-medium-2505 · CTX: 128K · 15 RPM
mistral-ai/mistral-small-2503 · CTX: 128K · 15 RPM
mistral-ai/codestral-2501 · CTX: 256K · 15 RPM
microsoft/mai-ds-r1 · CTX: 128K · 2 RPM
microsoft/phi-4-reasoning · CTX: 32K · 15 RPM
microsoft/phi-4-multimodal-instruct · CTX: 128K · 15 RPM
microsoft/phi-4-mini-reasoning · CTX: 128K · 15 RPM
microsoft/phi-4 · CTX: 16K · 15 RPM
cohere/cohere-command-a · CTX: 128K · 15 RPM
ai21-labs/ai21-jamba-1.5-large · CTX: 256K · 10 RPM
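The per-model RPM figures above imply a minimum spacing between requests if a client wants to stay under the limit. The following is a minimal sketch of that arithmetic, assuming simple client-side pacing with whole-second sleeps; the helper name `min_interval` is hypothetical and is not part of any GitHub tooling.

```shell
# Hypothetical helper: smallest whole number of seconds to wait between
# requests so a client stays at or under a requests-per-minute (RPM) limit.
# Computes ceil(60 / rpm) using integer arithmetic.
min_interval() {
  rpm=$1
  echo $(( (60 + rpm - 1) / rpm ))
}
```

For example, `min_interval 15` prints 4 and `min_interval 1` prints 60, so a 1 RPM reasoning model allows at most one call per minute. Pacing alone does not protect the daily quota: at a steady 15 RPM, the low-tier allowance of 150 RPD is used up in 10 minutes.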
Quick Start
# API_KEY is a GitHub personal access token (PAT)
export API_KEY="your_api_key_here"
curl https://models.github.ai/inference/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
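With limits this low, a script will likely hit the quota at some point. The sketch below wraps the same request in a small retry loop. It assumes the service signals rate limiting with HTTP 429, the conventional status code for that case; `MAX_TRIES`, `WAIT_SECONDS`, `response.json`, and the function names are illustrative choices, not documented values.

```shell
# Illustrative settings (assumptions, not documented limits).
MAX_TRIES=3
WAIT_SECONDS=5

# Map an HTTP status code to an action: prints "ok", "retry", or "fail".
classify_status() {
  case "$1" in
    2??) echo ok ;;
    429) echo retry ;;   # assumed rate-limit signal
    *)   echo fail ;;
  esac
}

# Send one request; the body goes to response.json, the status code to stdout.
chat_once() {
  curl -sS -o response.json -w '%{http_code}' \
    https://models.github.ai/inference/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d '{"model": "openai/gpt-4.1-mini", "messages": [{"role": "user", "content": "Hello! How are you?"}]}'
}

# Retry on "retry", print the body on "ok", give up on anything else.
chat_with_retry() {
  try=1
  while [ "$try" -le "$MAX_TRIES" ]; do
    status=$(chat_once)
    case "$(classify_status "$status")" in
      ok)    cat response.json; return 0 ;;
      retry) sleep "$WAIT_SECONDS" ;;
      *)     echo "request failed with HTTP $status" >&2; return 1 ;;
    esac
    try=$((try + 1))
  done
  echo "still rate limited after $MAX_TRIES tries" >&2
  return 1
}
```

After exporting `API_KEY` as in the Quick Start, calling `chat_with_retry` sends the request. A fixed wait keeps the sketch short; for the 1 to 4 RPM models a longer wait is probably more appropriate.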