NVIDIA NIM

active

NVIDIA NIM is NVIDIA's free DGX Cloud catalog of 100+ open models (DeepSeek V4, Llama, Nemotron, Kimi, GLM, gpt-oss, Qwen), accessible through an OpenAI-compatible endpoint.

https://integrate.api.nvidia.com/v1

Rate Limits

40 RPM / RPD not listed

Sign-up Required

Yes

Info

Base URL: https://integrate.api.nvidia.com/v1
Sign-up Required: Yes
Credit Card: Not required
Context Window: 1M
Last Verified: 2026-05-20
openai-compatible · multi-provider · deepseek · llama · vision · thinking · function-call
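
Because the endpoint is OpenAI-compatible, an existing OpenAI client can often be pointed at it through environment variables instead of code changes. The fragment below is a sketch under one assumption: the client reads `OPENAI_BASE_URL` and `OPENAI_API_KEY`, as the official OpenAI Python and Node.js SDKs do. Those variable names do not come from this listing, and other tools may expect different ones.

```bash
# Assumption: the client honors OPENAI_BASE_URL / OPENAI_API_KEY
# (true for the official OpenAI Python and Node.js SDKs; check other tools).
export OPENAI_BASE_URL="https://integrate.api.nvidia.com/v1"
# Placeholder value: substitute the key issued after sign-up.
export OPENAI_API_KEY="your_api_key_here"
```

Model names still have to be the catalog identifiers, such as `meta/llama-3.3-70b-instruct`, rather than OpenAI model names.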

Models (15)

meta/llama-3.3-70b-instruct · CTX: 128K · 40 RPM · active
meta/llama-3.1-70b-instruct · CTX: 128K · 40 RPM · active
meta/llama-3.1-8b-instruct · CTX: 128K · 40 RPM · active
meta/llama-4-maverick-17b-128e-instruct · CTX: 128K · 40 RPM · active
deepseek-ai/deepseek-v4-pro · CTX: 164K · 40 RPM · active
deepseek-ai/deepseek-v4-flash · CTX: 164K · 40 RPM · active
nvidia/llama-3.1-nemotron-ultra-253b-v1 · CTX: 128K · 40 RPM · active
nvidia/llama-3.3-nemotron-super-49b-v1.5 · CTX: 128K · 40 RPM · active
nvidia/nemotron-3-super-120b-a12b · CTX: 128K · 40 RPM · active
moonshotai/kimi-k2.6 · CTX: 256K · 40 RPM · active
openai/gpt-oss-120b · CTX: 128K · 40 RPM · active
openai/gpt-oss-20b · CTX: 128K · 40 RPM · active
qwen/qwen3-coder-480b-a35b-instruct · CTX: 262K · 40 RPM · active
z-ai/glm-5.1 · CTX: 200K · 40 RPM · active
mistralai/mixtral-8x22b-instruct-v0.1 · CTX: 64K · 40 RPM · active
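
Every model above is listed at 40 RPM, which works out to one request every 60 / 40 = 1.5 seconds when calls are spaced evenly. The following is a minimal sketch of client-side pacing. `min_interval` and `paced` are hypothetical helper names, and the listing does not say whether the limit is enforced per model, per key, or over a sliding window, so even spacing is a conservative assumption rather than documented behavior.

```bash
# Hypothetical helpers for staying under a requests-per-minute (RPM) limit.

# min_interval RPM -> seconds between requests when spaced evenly.
min_interval() {
  awk -v rpm="$1" 'BEGIN { printf "%.2f\n", 60 / rpm }'
}

# paced RPM COMMAND [ARGS...] -> run COMMAND, then wait one interval.
# Fractional sleep works with GNU and BSD sleep; strict POSIX sleep
# only guarantees whole seconds.
paced() {
  local rpm="$1"
  shift
  "$@"
  local rc=$?
  sleep "$(min_interval "$rpm")"
  return "$rc"
}
```

Calling `paced 40 curl ...` inside a loop keeps a batch job at or below 40 calls per minute, since each iteration takes at least 1.5 seconds on top of the request time.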

Quick Start

bash
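# Placeholder: substitute the API key issued after sign-up.
export API_KEY="your_api_key_here"
# Send a single-turn chat completion request to the OpenAI-compatible endpoint.
curl https://integrate.api.nvidia.com/v1/chat/completions \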
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [
      {"role": "user", "content": "Hello! How are you?"}
    ]
  }'
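
With a 40 RPM limit, a script that sends requests in bursts may occasionally be throttled. The sketch below wraps the request above in a retry loop with exponential backoff. It assumes the service signals throttling with HTTP 429, which is the usual convention but is not stated in this listing. `backoff_delay` and `chat_with_retry` are hypothetical helper names, and `API_KEY` is the variable exported in the quick start.

```bash
BASE_URL="https://integrate.api.nvidia.com/v1"

# backoff_delay ATTEMPT -> seconds to wait after failed attempt 1, 2, 3, ...
# (1, 2, 4, ... seconds).
backoff_delay() {
  echo $(( 1 << ($1 - 1) ))
}

# chat_with_retry JSON_PAYLOAD [MAX_ATTEMPTS]
# Prints the response body; succeeds only on HTTP 200.
# Assumption: HTTP 429 means "rate limited, try again later".
chat_with_retry() {
  local payload="$1" max_attempts="${2:-4}" attempt=1 http_code body_file
  body_file="$(mktemp)"
  while [ "$attempt" -le "$max_attempts" ]; do
    # -o saves the body to a file; -w prints only the status code to stdout.
    http_code="$(curl -sS -o "$body_file" -w '%{http_code}' \
      "$BASE_URL/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $API_KEY" \
      -d "$payload")"
    if [ "$http_code" != "429" ]; then
      cat "$body_file"
      rm -f "$body_file"
      [ "$http_code" = "200" ]
      return
    fi
    # Wait before the next attempt, but not after the final one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$(backoff_delay "$attempt")"
    fi
    attempt=$(( attempt + 1 ))
  done
  rm -f "$body_file"
  return 1
}
```

A call would look like `chat_with_retry '{"model": "meta/llama-3.3-70b-instruct", "messages": [{"role": "user", "content": "Hello!"}]}'`. Any status other than 429 is returned immediately, so authentication or validation errors are not retried.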