
Endpoint

POST /api/chat/completions
Compatible with the OpenAI chat completions format. Supports streaming, multimodal input (images and video), tool calling, and structured output.

Request Parameters

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | yes | | Model name (e.g. gpt-4o-mini, claude-sonnet-4-6, gpt-4-1-nano-2025-04-14) |
| messages | array | yes | | Array of message objects. Must not be empty. |
| stream | boolean | no | false | Stream the response as server-sent events. |
| max_tokens | integer | no | varies | Maximum tokens in the response. |
| temperature | number | no | varies | Sampling temperature (0-2). |
| top_p | number | no | | Nucleus sampling parameter. |
| frequency_penalty | number | no | | Penalize repeated tokens. |
| presence_penalty | number | no | | Penalize tokens already present. |
| tools | array | no | | Tool/function definitions for tool calling. |
| tool_choice | string/object | no | | Control tool selection behavior. |
| parallel_tool_calls | boolean | no | | Allow parallel tool calls. |
| response_format | object | no | | Constrain response format (e.g. {"type": "json_object"}). |
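
Putting the parameters together, a complete request body might look like the following. The model name and values here are illustrative, not required:

```json
{
  "model": "gpt-4o-mini",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false,
  "max_tokens": 100,
  "temperature": 0.7
}
```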

Message Format

Each message has a role and content:
[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello!"},
  {"role": "assistant", "content": "Hi there!"}
]
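
The API is stateless, so multi-turn conversations are built by appending each assistant reply to the messages array before the next request. A minimal sketch of that bookkeeping (the reply string here is a stand-in for an actual API response):

```python
# Conversation state is just a growing list of message dicts.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# After each call, append the assistant's reply so the model
# sees the full history on the next turn.
assistant_reply = "Hi there!"  # placeholder for response.choices[0].message.content
messages.append({"role": "assistant", "content": assistant_reply})

# Then add the next user turn and send the whole list again.
messages.append({"role": "user", "content": "What can you do?"})

print([m["role"] for m in messages])
```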

Vision (multimodal)

Use a content array to include images or video:
{
  "role": "user",
  "content": [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
  ]
}
Video input:
{
  "role": "user",
  "content": [
    {"type": "text", "text": "Describe this video"},
    {"type": "video_url", "video_url": {"url": "https://example.com/clip.mp4"}}
  ]
}
Image and video URLs must be publicly accessible.
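
The content-array shape above is easy to build programmatically. A small sketch; the helper name is ours, not part of the API:

```python
def vision_message(text, image_urls):
    """Build a user message with a text part plus one image_url part per URL."""
    content = [{"type": "text", "text": text}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}

msg = vision_message("What's in this image?", ["https://example.com/photo.jpg"])
print(msg["content"][1]["image_url"]["url"])
```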

Examples

Basic text generation

from openai import OpenAI

client = OpenAI(
    base_url="https://hub.oxen.ai/api",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4-1-nano-2025-04-14",
    messages=[{"role": "user", "content": "Say hello in exactly 3 words."}],
    max_tokens=50,
    temperature=0.1,
)

print(response.choices[0].message.content)

Response

{
  "id": "chatcmpl-97eab7db-fe67-4b29-900c-ed5260c654d4",
  "object": "chat.completion",
  "created": 1775090332,
  "model": "gpt-4-1-nano-2025-04-14",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello, how are?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 5,
    "total_tokens": 20
  }
}

Streaming

from openai import OpenAI

client = OpenAI(
    base_url="https://hub.oxen.ai/api",
    api_key="YOUR_API_KEY",
)

stream = client.chat.completions.create(
    model="gpt-4-1-nano-2025-04-14",
    messages=[{"role": "user", "content": "Say hello"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
print()
Returns server-sent events. Each chunk has a delta instead of a full message:
data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":null,"index":0}],"created":1775090334,"id":"chatcmpl-...","model":"gpt-4-1-nano-2025-04-14","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":" there"},"finish_reason":null,"index":0}],...}

data: [DONE]
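
If you consume the stream without the SDK, each event is a `data: ` prefix followed by a JSON chunk, terminated by `data: [DONE]`. A sketch of hand-parsing those lines with only the standard library (the sample lines mirror the ones above):

```python
import json

def collect_content(sse_lines):
    """Concatenate delta.content across chunks until [DONE]."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        content = chunk["choices"][0]["delta"].get("content")
        if content:
            parts.append(content)
    return "".join(parts)

lines = [
    'data: {"choices":[{"delta":{"content":"Hello"},"finish_reason":null,"index":0}]}',
    'data: {"choices":[{"delta":{"content":" there"},"finish_reason":null,"index":0}]}',
    "data: [DONE]",
]
print(collect_content(lines))  # Hello there
```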

Tool calling

from openai import OpenAI

client = OpenAI(
    base_url="https://hub.oxen.ai/api",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4-1-nano-2025-04-14",
    messages=[
        {"role": "system", "content": "Use tools when appropriate."},
        {"role": "user", "content": "What is the weather in San Francisco?"},
    ],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
)

tool_call = response.choices[0].message.tool_calls[0]
print(f"{tool_call.function.name}({tool_call.function.arguments})")
When the model uses a tool, finish_reason is "tool_calls":
{
  "choices": [{
    "finish_reason": "tool_calls",
    "message": {
      "content": null,
      "role": "assistant",
      "tool_calls": [{
        "id": "call_GRNwPXnbuQW4Sa3QNB3FYkYw",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"location\":\"San Francisco\"}"
        }
      }]
    }
  }]
}
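
To complete the loop, execute the tool locally, then send its result back as a `tool` role message that references the `tool_call_id`, and call the API again with the updated messages. A sketch of building that follow-up message; the weather lookup is a stub, not a real service:

```python
import json

def run_tool_call(tool_call):
    """Execute a tool call locally and build the 'tool' message to send back."""
    args = json.loads(tool_call["function"]["arguments"])
    # Stub result; a real app would query a weather service here.
    result = {"location": args["location"], "temp_f": 61, "conditions": "fog"}
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }

# Shaped like the tool_calls entry in the response above.
call = {
    "id": "call_GRNwPXnbuQW4Sa3QNB3FYkYw",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"location\":\"San Francisco\"}"},
}
msg = run_tool_call(call)
print(msg["role"], msg["tool_call_id"])
```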

Structured output (JSON mode)

from openai import OpenAI

client = OpenAI(
    base_url="https://hub.oxen.ai/api",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "List 3 colors as a JSON array"}],
    response_format={"type": "json_object"},
    max_tokens=100,
)

print(response.choices[0].message.content)
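
JSON mode constrains the model to emit valid JSON, but the shape of the object is still up to your prompt, so it is worth parsing defensively. A sketch, with a sample string standing in for the response content:

```python
import json

content = '{"colors": ["red", "green", "blue"]}'  # stand-in for message.content

try:
    data = json.loads(content)
except json.JSONDecodeError:
    data = None  # fall back or retry in a real application

print(data["colors"])
```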

Errors

| Condition | Error |
|---|---|
| No model specified | "You must specify a model to call" |
| Model not found | "Model not found: <name>" |
| Empty messages | "Messages array cannot be empty" |
| Insufficient credits | Credit-related error message |