GitHub Models
Status: active
Free model inference via a GitHub personal access token (PAT): access GPT-4o, Llama, Phi, DeepSeek, and more. Low tier: 15 requests per minute (RPM) / 150 requests per day (RPD); high tier: 10 RPM / 50 RPD. No credit card required.
https://models.inference.ai.azure.com
Rate Limits: 15 RPM / 150 RPD
Info
Base URL: https://models.inference.ai.azure.com
Sign-up Required: Yes
Credit Card: Not required
Context Window: 8K
Last Verified: 2026-03-30
openai-compatible · multi-model · vision · function-call
Models (5)
gpt-4o · CTX: 8K · 10 RPM
gpt-4o-mini · CTX: 8K · 15 RPM
meta-llama/Llama-3.3-70B-Instruct · CTX: 8K · 10 RPM
DeepSeek-R1 · CTX: 8K · 10 RPM
microsoft/Phi-4 · CTX: 8K · 15 RPM
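The per-model RPM caps listed above translate into a minimum spacing between requests if a client wants to stay under the limit. The sketch below is only an illustration of that arithmetic: `min_interval_seconds` is a hypothetical helper name, and it assumes requests are spread evenly across each minute.

```shell
#!/bin/sh
# Hypothetical helper (not part of any GitHub tooling): smallest whole-second
# delay between requests that keeps a client at or below a given RPM cap.
# Rounds up so the cap is never exceeded.
min_interval_seconds() {
  rpm=$1
  echo $(( (60 + rpm - 1) / rpm ))
}

min_interval_seconds 15   # prints 4 (gpt-4o-mini, microsoft/Phi-4)
min_interval_seconds 10   # prints 6 (gpt-4o, Llama-3.3-70B-Instruct, DeepSeek-R1)
```

Pacing by RPM alone does not cover the daily cap: at one request every 4 seconds, a 150 RPD budget would be used up in 10 minutes.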
Quick Start
export API_KEY="your_api_key_here"  # your GitHub PAT
curl https://models.inference.ai.azure.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
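With limits as low as 10 RPM, a script that sends the request above in a loop may get rejected part of the time. The following is a minimal sketch of one way to retry, assuming the service signals rate limiting with HTTP 429 (the standard "Too Many Requests" status). The function name `chat_with_retry`, the default of 3 attempts, and the doubling delay are illustrative choices rather than anything documented for this endpoint.

```shell
#!/bin/sh
# Assumes API_KEY is exported as in the Quick Start command.
BASE_URL="https://models.inference.ai.azure.com"

# Hypothetical helper: POST a chat payload and retry while the status is 429.
# Usage: chat_with_retry '<json payload>' [max_attempts] [initial_delay_seconds]
chat_with_retry() {
  payload=$1
  max_attempts=${2:-3}
  delay=${3:-4}            # 4 s matches the spacing implied by a 15 RPM cap
  body_file=$(mktemp)
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # -o sends the response body to a file; -w prints only the HTTP status.
    http_code=$(curl -s -o "$body_file" -w '%{http_code}' \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d "$payload" \
      "$BASE_URL/chat/completions")
    if [ "$http_code" != "429" ]; then
      cat "$body_file"
      rm -f "$body_file"
      if [ "$http_code" = "200" ]; then return 0; else return 1; fi
    fi
    sleep "$delay"
    delay=$((delay * 2))   # back off: double the wait after each 429
    attempt=$((attempt + 1))
  done
  rm -f "$body_file"
  return 1
}

# Example call (commented out so sourcing this file sends nothing):
# chat_with_retry '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'
```

The function prints the response body for any non-429 status, so error messages from the service stay visible, and it returns non-zero unless the final status was 200.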