GitHub Models

Status: active

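Free model inference authenticated with a GitHub personal access token (PAT). Available models include GPT-5, GPT-4.1, o3, o4-mini, Llama 4, DeepSeek R1, Grok 3, Mistral, and more. Rate limits depend on the model tier, in requests per minute (RPM) and requests per day (RPD): Low tier 15 RPM / 150 RPD; High tier 10 RPM / 50 RPD; reasoning models have custom per-model quotas. No credit card required.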
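To make the quota figures above concrete, here is a minimal sketch of the arithmetic. It assumes the daily quota is counted over a 24-hour window, which the listing does not state, and the function names `burst_minutes` and `even_spacing_seconds` are hypothetical helpers, not part of any GitHub tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for reasoning about the RPM / RPD limits quoted above.

# Minutes of sending at the full per-minute rate before the daily quota
# is used up (rounded up). Args: RPM, RPD.
burst_minutes() {
  local rpm=$1 rpd=$2
  echo $(( (rpd + rpm - 1) / rpm ))
}

# Seconds to wait between requests to spread the daily quota evenly.
# Assumption: the daily quota covers a 24-hour (86400 s) window.
even_spacing_seconds() {
  local rpd=$1
  echo $(( 86400 / rpd ))
}

echo "Low tier:  quota lasts $(burst_minutes 15 150) min at full rate; one request every $(even_spacing_seconds 150) s spreads it over a day"
echo "High tier: quota lasts $(burst_minutes 10 50) min at full rate; one request every $(even_spacing_seconds 50) s spreads it over a day"
```

At the Low tier, sending at 15 RPM uses up 150 RPD in 10 minutes, so sustained workloads are bounded by the daily figure rather than the per-minute one.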

https://models.github.ai/inference

Rate Limits: 15 RPM / 150 RPD (Low tier)

Info

Base URL: https://models.github.ai/inference
Sign-up Required: Yes
Credit Card: Not required
Context Window: 1M
Last Verified: 2026-05-20
Tags: openai-compatible, multi-model, vision, function-call, reasoning

Models (32)

openai/gpt-4.1 · CTX: 1M · 10 RPM · active
openai/gpt-4.1-mini · CTX: 1M · 15 RPM · active
openai/gpt-4.1-nano · CTX: 1M · 15 RPM · active
openai/gpt-4o · CTX: 128K · 10 RPM · active
openai/gpt-4o-mini · CTX: 128K · 15 RPM · active
openai/gpt-5 · CTX: 200K · 1 RPM · active
openai/gpt-5-chat · CTX: 200K · 1 RPM · active
openai/gpt-5-mini · CTX: 200K · 2 RPM · active
openai/gpt-5-nano · CTX: 200K · 4 RPM · active
openai/o4-mini · CTX: 200K · 2 RPM · active
openai/o3 · CTX: 200K · 1 RPM · active
openai/o3-mini · CTX: 200K · 3 RPM · active
openai/o1 · CTX: 200K · 1 RPM · active
meta/llama-4-scout-17b-16e-instruct · CTX: 10M · 10 RPM · active
meta/llama-4-maverick-17b-128e-instruct-fp8 · CTX: 1M · 10 RPM · active
meta/llama-3.3-70b-instruct · CTX: 128K · 10 RPM · active
meta/llama-3.2-90b-vision-instruct · CTX: 128K · 10 RPM · active
deepseek/deepseek-r1-0528 · CTX: 128K · 2 RPM · active
deepseek/deepseek-r1 · CTX: 128K · 2 RPM · active
deepseek/deepseek-v3-0324 · CTX: 128K · 10 RPM · active
xai/grok-3 · CTX: 128K · 2 RPM · active
xai/grok-3-mini · CTX: 128K · 3 RPM · active
mistral-ai/mistral-medium-2505 · CTX: 128K · 15 RPM · active
mistral-ai/mistral-small-2503 · CTX: 128K · 15 RPM · active
mistral-ai/codestral-2501 · CTX: 256K · 15 RPM · active
microsoft/mai-ds-r1 · CTX: 128K · 2 RPM · active
microsoft/phi-4-reasoning · CTX: 32K · 15 RPM · active
microsoft/phi-4-multimodal-instruct · CTX: 128K · 15 RPM · active
microsoft/phi-4-mini-reasoning · CTX: 128K · 15 RPM · active
microsoft/phi-4 · CTX: 16K · 15 RPM · active
cohere/cohere-command-a · CTX: 128K · 15 RPM · active
ai21-labs/ai21-jamba-1.5-large · CTX: 256K · 10 RPM · active
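Because the per-minute limit in the list above ranges from 1 RPM to 15 RPM, a batch script has to pace itself per model. The following is a rough sketch of one way to do that: `rpm_for` and `interval_for` are hypothetical helpers, the lookup covers only a subset of the listed models, and falling back to 1 RPM for unknown models is an assumption chosen to err on the safe side.

```bash
#!/usr/bin/env bash
# Hypothetical lookup: per-model RPM copied from a subset of the list above.
rpm_for() {
  case "$1" in
    openai/gpt-4.1-mini|openai/gpt-4.1-nano|openai/gpt-4o-mini) echo 15 ;;
    openai/gpt-4.1|openai/gpt-4o|meta/llama-3.3-70b-instruct)   echo 10 ;;
    openai/gpt-5-nano)                                          echo 4 ;;
    openai/o3-mini|xai/grok-3-mini)                             echo 3 ;;
    openai/gpt-5-mini|openai/o4-mini|deepseek/deepseek-r1)      echo 2 ;;
    openai/gpt-5|openai/o3|openai/o1)                           echo 1 ;;
    *) echo 1 ;;  # assumption: unknown models get the strictest listed limit
  esac
}

# Smallest whole number of seconds between requests that stays within the RPM.
interval_for() {
  local rpm
  rpm=$(rpm_for "$1")
  echo $(( (60 + rpm - 1) / rpm ))
}

for model in openai/gpt-4.1-mini openai/gpt-4.1 openai/gpt-5-nano openai/gpt-5; do
  echo "$model: wait at least $(interval_for "$model") s between requests"
done
```

A loop that runs `sleep "$(interval_for "$model")"` between calls would stay under the per-minute figure; the daily quota still has to be tracked separately.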

Quick Start

bash
# API_KEY holds a GitHub personal access token (PAT).
export API_KEY="your_github_pat_here"

# Send one chat completion request to the OpenAI-compatible endpoint.
curl https://models.github.ai/inference/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
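For scripting, the Quick Start request can be wrapped in a small function so the model and prompt become parameters. This is a sketch only: `build_payload` and `chat` are hypothetical names, and the escaping handles just backslashes and double quotes, so prompts containing newlines or other control characters would need a real JSON encoder such as `jq`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the JSON body for a single user message.
# Only backslashes and double quotes are escaped; prompts containing
# newlines or other control characters need a real JSON encoder.
build_payload() {
  local model=$1 prompt
  prompt=$(printf '%s' "$2" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}' "$model" "$prompt"
}

# Hypothetical wrapper around the Quick Start request.
# Assumes API_KEY is exported, as in the Quick Start above.
chat() {
  curl -sS https://models.github.ai/inference/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $API_KEY" \
    -d "$(build_payload "$1" "$2")"
}

# Print the request body without sending anything.
build_payload "openai/gpt-4.1-mini" 'Say "hi"'
echo
```

With `API_KEY` set, `chat "openai/gpt-4.1-mini" "Hello! How are you?"` would send the same request as the Quick Start.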