📖 API Documentation

ModelBridge provides a fully OpenAI-compatible API. If you've used the OpenAI SDK before, you already know how to use it: just change the base URL and use your ModelBridge API key.

🚀 Quick Start

Get your first response in under 3 minutes. No credit card required.

  1. Create a Free Account
     Visit /auth/register.html and sign up with your email. No credit card needed.

  2. Get Your API Key
     After login, go to the Dashboard. Your API key (starts with mb-) is displayed at the top.

  3. Make Your First Call
     Copy the Python code below and replace YOUR_API_KEY with your actual key.

  4. Check Usage
     Visit the Dashboard anytime to view your token usage, remaining quota, and billing status.

First API Call — Copy & Paste

Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://aibridge-api.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

✅ Expected response: a normal chat reply from DeepSeek V3. If you get a 401 error, double-check your API key.

🔑 Authentication

How to Get Your API Key

  1. Go to aibridge-api.com/auth/register.html and create a free account
  2. After registration, you'll be redirected to the Dashboard
  3. Your API key (format: mb-xxxxxxxxxxxxxxxxx) is displayed at the top of the dashboard
  4. Click the copy button to copy it — you won't be able to see it again after leaving the page (but you can always generate a new one)

Using the API Key

All requests require an API key in the Authorization header:

Authorization: Bearer mb-xxxxxxxxxxxxx

⚠️ Security Note: Never expose your API key in client-side code (browsers). Always call the API from your backend server. If your key is compromised, generate a new one immediately from the Dashboard.
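
One way to keep the key out of your source code is to read it from an environment variable on your server. The sketch below assumes a variable named MODELBRIDGE_API_KEY, which is a hypothetical name chosen for this example and not something the service defines.

```python
import os


def auth_headers(env_var: str = "MODELBRIDGE_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    if not api_key.startswith("mb-"):
        # Keys are documented as starting with "mb-".
        raise ValueError("API key should start with 'mb-'")
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early on a missing or malformed key tends to be easier to debug than a 401 from the server.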

🌐 Base URL

https://aibridge-api.com/v1

🔗 Available Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v1/models | List available models |
| POST | /v1/chat/completions | Chat completion (supports streaming) |
| POST | /v1/embeddings | Generate text embeddings |
| GET | /health | Service health check |
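
If you prefer not to use the SDK, the endpoints above can be called with the Python standard library alone. This is a rough sketch: build_request and send are helper names invented for the example, and it assumes /health lives at the host root because the table lists it without the /v1 prefix.

```python
import json
import urllib.request

# Host taken from the Base URL section; /health is listed without /v1.
API_ROOT = "https://aibridge-api.com"


def build_request(method, path, api_key, payload=None):
    """Return an unsent urllib Request for one of the listed endpoints."""
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        API_ROOT + path, data=data, headers=headers, method=method
    )


def send(request, timeout=30):
    """Send the request and decode a JSON body (this performs a network call)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


models_request = build_request("GET", "/v1/models", "YOUR_API_KEY")
print(models_request.get_method(), models_request.full_url)
```

Calling send(models_request) with a real key would fetch the model list. Whether /health needs the Authorization header or returns JSON is not stated here, so treat that call as untested.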

🤖 Supported Models

| Model ID | Type | Context Window | Best For |
| --- | --- | --- | --- |
| deepseek-chat | Chat | 64K | General purpose, coding (V3) |
| deepseek-reasoner | Reasoning | 64K | Complex reasoning, math, logic |
| deepseek-coder | Coding | 64K | Code generation & debugging (V2.5) |
| deepseek-v4-pro | Chat | 128K | Flagship reasoning model, top-tier performance |
| deepseek-v4-flash | Chat | 128K | Fast & lightweight, quick responses |
| qwen-max | Chat | 32K | Multilingual tasks, long context |
| qwen-plus | Chat | 131K | Cost-effective general usage |
| qwen3-235b-a22b | Chat | 128K | Flagship Qwen3 model, best overall |
| glm-4-plus | Chat | 128K | Advanced reasoning, complex tasks |
| glm-4-air | Chat | 128K | Balanced performance & speed |
| glm-4-flash | Chat | 128K | Fast & lightweight, cost-effective |
| moonshot-v1-8k | Chat | 8K | Quick conversations |
| moonshot-v1-32k | Chat | 32K | Medium context length |
| moonshot-v1-128k | Chat | 128K | Long documents, deep analysis |
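
The three Moonshot variants differ only in context window, so the choice can be automated. The sketch below is an illustration under two assumptions: that "8K" means 8,000 tokens, and that the window has to hold the prompt plus the completion. pick_moonshot_model is a hypothetical helper, and you would need to supply your own token counts.

```python
# Context windows copied from the Moonshot rows of the table above,
# reading "8K" as 8,000 tokens (an assumption; 1,024-based is also possible).
MOONSHOT_CONTEXT = {
    "moonshot-v1-8k": 8_000,
    "moonshot-v1-32k": 32_000,
    "moonshot-v1-128k": 128_000,
}


def pick_moonshot_model(prompt_tokens: int, max_tokens: int) -> str:
    """Return the smallest Moonshot variant that fits prompt + completion."""
    needed = prompt_tokens + max_tokens
    by_size = sorted(MOONSHOT_CONTEXT.items(), key=lambda item: item[1])
    for model_id, window in by_size:
        if needed <= window:
            return model_id
    raise ValueError(f"{needed} tokens exceeds the largest listed window")


print(pick_moonshot_model(prompt_tokens=6_000, max_tokens=500))
print(pick_moonshot_model(prompt_tokens=20_000, max_tokens=1_000))
```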

💬 Chat Completions

The primary endpoint for generating text responses. Supports both regular and streaming (SSE) modes.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Required | Model ID (e.g. deepseek-chat) |
| messages | array | Required | List of message objects. Each has role (system/user/assistant) and content (string) |
| temperature | float | Optional | Sampling temperature (0–2). Higher = more random. Default: 1.0 |
| max_tokens | int | Optional | Max tokens to generate. Default varies by model |
| top_p | float | Optional | Nucleus sampling (0–1). Default: 1.0 |
| stream | boolean | Optional | If true, returns SSE stream. Default: false |
| stop | string/array | Optional | Up to 4 sequences where the API will stop generating |
| presence_penalty | float | Optional | Penalty on tokens that have already appeared, regardless of how often (-2 to 2). Default: 0 |
| frequency_penalty | float | Optional | Penalty that grows with how often a token has already appeared (-2 to 2). Default: 0 |
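
To catch out-of-range values before they reach the API, you can validate the request body locally. The helper below is a sketch, not part of any SDK: build_chat_payload is a hypothetical name, and the checks simply mirror the ranges in the table above. Whether the server rejects out-of-range values is not stated here.

```python
def build_chat_payload(model, messages, **options):
    """Assemble a /v1/chat/completions body, checking the documented ranges."""
    ranges = {
        "temperature": (0.0, 2.0),
        "top_p": (0.0, 1.0),
        "presence_penalty": (-2.0, 2.0),
        "frequency_penalty": (-2.0, 2.0),
    }
    allowed = set(ranges) | {"max_tokens", "stream", "stop"}
    payload = {"model": model, "messages": messages}
    for name, value in options.items():
        if name not in allowed:
            raise ValueError(f"Unknown parameter: {name}")
        if name in ranges:
            low, high = ranges[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        # stop may be a single string or a list of up to 4 strings.
        if name == "stop" and not isinstance(value, str) and len(value) > 4:
            raise ValueError("stop accepts at most 4 sequences")
        payload[name] = value
    return payload


payload = build_chat_payload(
    "deepseek-chat",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=500,
)
print(payload)
```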

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion ID (e.g. chatcmpl-xxxxx) |
| object | string | Always "chat.completion" |
| created | int | Unix timestamp of creation |
| model | string | Model ID used for the completion |
| choices | array | List of completion choices. Each has index, message, finish_reason |
| usage | object | Token usage: prompt_tokens, completion_tokens, total_tokens |
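
If you work with the raw JSON instead of the SDK objects, the fields above can be read as shown in this sketch. The sample body is made up for illustration (it is not captured from the service), and summarize_completion is a hypothetical helper.

```python
import json

# Illustrative body shaped like the fields above; every value is made up.
sample_body = """
{
  "id": "chatcmpl-xxxxx",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "deepseek-chat",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello! How can I help?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16}
}
"""


def summarize_completion(body: str) -> dict:
    """Pull the reply text, finish reason, and token total out of a body."""
    data = json.loads(body)
    first = data["choices"][0]
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


print(summarize_completion(sample_body))
```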

Request Example

Python
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Explain quantum computing in 3 sentences."},
    ],
    temperature=0.7,
    max_tokens=500,
)
print(response.choices[0].message.content)
print("Tokens used:", response.usage.total_tokens)

Streaming Response

Set stream to true (stream=True in the Python SDK) to receive tokens as they are generated:

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

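for chunk in response:
    # Some chunks carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # finish with a newline once the stream ends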
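
Without the SDK, a streamed response arrives as Server-Sent Events that you parse yourself. The sketch below assumes the stream follows the OpenAI chunk format ("data: {json}" lines ending with "data: [DONE]"), which this page implies but does not spell out. iter_stream_text is a hypothetical helper and the sample lines are made up.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel used by OpenAI-style streams
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Made-up sample lines in the OpenAI chunk format, for illustration only.
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Once"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": " upon a time"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In a real client, the lines would come from the HTTP response body, decoded as UTF-8 and read one line at a time.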

🔢 Embeddings

Generate vector embeddings from text. Compatible with OpenAI's /v1/embeddings endpoint.

Python
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input="The food was delicious",
)
print(response.data[0].embedding)  # List[float]
print(response.usage.prompt_tokens)

Note: Embeddings support varies by underlying model provider. Check the Supported Models table for availability.
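
A common next step is comparing two embeddings with cosine similarity. This is a generic, dependency-free sketch and not something the API provides. The vectors below are tiny made-up stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Tiny made-up vectors; real embeddings have far more dimensions.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

With real data, you would pass response.data[i].embedding for each text you want to compare.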

⚡ Rate Limits

| Plan | Requests / Min (RPM) | Tokens / Min (TPM) |
| --- | --- | --- |
| Free | 20 | 50,000 |
| Pro | 60 | 200,000 |
| Custom | Custom | Custom |

If you exceed rate limits, you'll receive a 429 error. Implement exponential backoff in production.
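
Here is one way exponential backoff could look. It is a sketch with assumed defaults: RateLimited is a hypothetical exception your own request code would raise on a 429, and the retry count and delays are illustrative rather than recommended by the service.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your own request code on HTTP 429."""


def with_backoff(call, max_retries=5, base_delay=1.0, max_delay=30.0,
                 sleep=time.sleep):
    """Run call(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # out of retries; surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

If you use the OpenAI SDK, the analogous exception to catch is openai.RateLimitError. Passing sleep as a parameter makes the helper easy to test without real waiting.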

⚠️ Error Codes

| Status Code | Error | Cause | Solution |
| --- | --- | --- | --- |
| 401 | Unauthorized | Invalid or missing API key | Check your Authorization header format: Bearer mb-xxx |
| 404 | Not Found | Invalid endpoint or model ID | Verify the URL and model name |
| 429 | Rate Limited | Too many requests | Reduce frequency; upgrade plan if needed |
| 500 | Internal Error | Upstream service issue | Retry after a few seconds |
| 502 | Bad Gateway | Upstream unavailable | Upstream AI provider is down |
| 504 | Gateway Timeout | Request took too long | Try with shorter prompts |
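
The table above can be turned into a small lookup for your error handling. This is a sketch: triage is a hypothetical helper, and treating 429, 500, and 502 as safe to retry unchanged is this example's own judgment based on the Solution column, not a documented guarantee.

```python
# Advice text taken from the Solution column of the table above.
ERROR_ADVICE = {
    401: "Check your Authorization header format: Bearer mb-xxx",
    404: "Verify the URL and model name",
    429: "Reduce frequency; upgrade plan if needed",
    500: "Retry after a few seconds",
    502: "Upstream AI provider is down",
    504: "Try with shorter prompts",
}

# Assumption: these can be retried with the same request after a delay.
# 504 is left out because the suggested fix is to change the prompt.
RETRY_UNCHANGED = {429, 500, 502}


def triage(status: int):
    """Return (retry_same_request, advice) for an HTTP status code."""
    advice = ERROR_ADVICE.get(status, "Unlisted status code")
    return status in RETRY_UNCHANGED, advice


print(triage(429))
print(triage(401))
```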