Assisters API
API Reference

Chat Completions

Generate conversational responses with the chat completions endpoint

TL;DR

POST to /v1/chat/completions with a model and a messages array. Supports streaming (stream: true), temperature, max_tokens, and tools/function calling. Models: assisters-chat-v1 (general), assisters-code-v1 (code), assisters-vision-v1 (images).

Generate AI responses for conversational applications. This endpoint is fully compatible with the OpenAI Chat Completions API.

Endpoint

POST https://api.assisters.dev/v1/chat/completions

Request Body

model (string, required)

The model to use for the completion. See available models.

Examples: assisters-chat-v1, assisters-vision-v1, assisters-code-v1

messages (array, required)

An array of messages comprising the conversation so far.

Each message object has:

  • role (string): system, user, or assistant
  • content (string): The content of the message

stream (boolean, default: false)

If true, returns a stream of Server-Sent Events (SSE) for real-time responses.

max_tokens (integer)

Maximum number of tokens to generate. Defaults to the model's maximum.

temperature (number, default: 1.0)

Sampling temperature between 0 and 2. Higher values make output more random.

top_p (number, default: 1.0)

Nucleus sampling parameter. Use this or temperature, not both.

stop (string | array)

Up to 4 sequences where the API will stop generating tokens.

presence_penalty (number, default: 0)

Penalizes new tokens based on whether they already appear in the text so far. Range: -2.0 to 2.0.

frequency_penalty (number, default: 0)

Penalizes new tokens based on their frequency in the text so far. Range: -2.0 to 2.0.

user (string)

A unique identifier for the end user, useful for monitoring and abuse detection.
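Putting the optional fields together, a request body might look like the sketch below. Field names follow the OpenAI-compatible schema this endpoint advertises; the specific stop sequences and penalty values are illustrative, not recommendations.

```python
# Example request payload combining the optional sampling and penalty fields.
payload = {
    "model": "assisters-chat-v1",
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "List three uses for a paperclip."},
    ],
    "max_tokens": 256,
    "temperature": 0.7,        # use temperature OR top_p, not both
    "stop": ["\n\n"],          # up to 4 stop sequences
    "presence_penalty": 0.5,   # range: -2.0 to 2.0
    "frequency_penalty": 0.3,  # range: -2.0 to 2.0
    "user": "user-1234",       # stable end-user identifier
}
```

This dict can be passed as keyword arguments to the SDK's chat.completions.create, or serialized as JSON and POSTed directly to the endpoint.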

Request Examples

Basic Request

Python

from openai import OpenAI

client = OpenAI(
    api_key="ask_your_api_key",
    base_url="https://api.assisters.dev/v1"
)

response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of Japan?"}
    ]
)

print(response.choices[0].message.content)

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'ask_your_api_key',
  baseURL: 'https://api.assisters.dev/v1'
});

const response = await client.chat.completions.create({
  model: 'assisters-chat-v1',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is the capital of Japan?' }
  ]
});

console.log(response.choices[0].message.content);

cURL

curl https://api.assisters.dev/v1/chat/completions \
  -H "Authorization: Bearer ask_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-chat-v1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of Japan?"}
    ]
  }'

Streaming Request

Python

from openai import OpenAI

client = OpenAI(
    api_key="ask_your_api_key",
    base_url="https://api.assisters.dev/v1"
)

stream = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "user", "content": "Write a short poem about coding"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'ask_your_api_key',
  baseURL: 'https://api.assisters.dev/v1'
});

const stream = await client.chat.completions.create({
  model: 'assisters-chat-v1',
  messages: [
    { role: 'user', content: 'Write a short poem about coding' }
  ],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}

Multi-turn Conversation

messages = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 equals 4."},
    {"role": "user", "content": "And what is that multiplied by 3?"}
]

response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=messages
)
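To continue a multi-turn conversation, append the assistant's reply to the history before adding the next user turn. A minimal helper for this might look as follows (extend_conversation is a hypothetical name, not part of the SDK):

```python
def extend_conversation(messages, assistant_reply, next_user_message):
    """Append the assistant's last reply and the next user turn.

    Returns a new list so the original history is left untouched.
    """
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]
```

After each API call, feed the returned content back in, e.g. messages = extend_conversation(messages, response.choices[0].message.content, "And what is that multiplied by 3?").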

Response

Non-Streaming Response

{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "assisters-chat-v1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Japan is Tokyo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}

Streaming Response

Each chunk in the stream:

{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion.chunk",
  "created": 1706745600,
  "model": "assisters-chat-v1",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "The"
      },
      "finish_reason": null
    }
  ]
}

Final chunk:

{
  "id": "chatcmpl-abc123xyz",
  "object": "chat.completion.chunk",
  "created": 1706745600,
  "model": "assisters-chat-v1",
  "choices": [
    {
      "index": 0,
      "delta": {},
      "finish_reason": "stop"
    }
  ]
}
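When consuming the raw SSE stream without the SDK, the full message is reassembled by concatenating the delta.content of each chunk until a non-null finish_reason arrives. A sketch operating on parsed chunk payloads shaped like the examples above:

```python
def accumulate_stream(chunks):
    """Reassemble the full message from parsed chat.completion.chunk payloads.

    Returns (content, finish_reason); finish_reason is None only if the
    stream ended without a terminating chunk.
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        # The final chunk carries an empty delta and a finish_reason.
        parts.append(choice["delta"].get("content", ""))
        if choice["finish_reason"] is not None:
            finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```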

Response Fields

id (string)

Unique identifier for the completion.

object (string)

chat.completion, or chat.completion.chunk for streaming responses.

created (integer)

Unix timestamp of when the completion was created.

model (string)

The model used for the completion.

choices (array)

Array of completion choices. Each choice contains:

  • index: The index of this choice
  • message: The generated message (non-streaming)
  • delta: The incremental content (streaming)
  • finish_reason: Why generation stopped (stop, length, content_filter)

usage (object)

Token usage statistics (not included in streaming responses):

  • prompt_tokens: Tokens in the input
  • completion_tokens: Tokens in the output
  • total_tokens: Total tokens used

Finish Reasons

Reason          Description
stop            Natural completion or a stop sequence was reached
length          The max_tokens limit was reached
content_filter  Content was filtered by moderation
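Callers should branch on finish_reason before trusting a response, since a length or content_filter stop means the text is incomplete or redacted. A minimal sketch (check_finish_reason is an illustrative helper, not part of the SDK):

```python
def check_finish_reason(choice):
    """Map a choice's finish_reason to a simple outcome flag."""
    reason = choice["finish_reason"]
    if reason == "stop":
        return "complete"   # natural end or stop sequence
    if reason == "length":
        return "truncated"  # raise max_tokens or ask the model to continue
    if reason == "content_filter":
        return "filtered"   # content was removed by moderation
    raise ValueError(f"unexpected finish_reason: {reason!r}")
```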

Error Responses

Best Practices

Use System Messages

Set context and behavior with system messages for consistent responses

Stream Long Responses

Enable streaming for better UX with longer completions

Manage Conversation Length

Trim old messages to stay within token limits
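One simple trimming strategy is to keep the system message and only the most recent turns. The helper below caps by message count for brevity; a production version would count tokens against the model's context window instead (trim_history is an illustrative name, not part of the SDK):

```python
def trim_history(messages, max_messages=20):
    """Keep any system messages plus the most recent conversation turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_messages:]
```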

Handle Errors Gracefully

Implement retry logic with exponential backoff
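A retry wrapper with exponential backoff and jitter might be sketched as follows. It retries on any exception for brevity; in practice you would retry only rate-limit (429) and transient server errors (with_retries is an illustrative helper, not part of the SDK):

```python
import random
import time


def with_retries(call, max_attempts=5, base_delay=1.0):
    """Invoke call(), retrying with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.1)
            time.sleep(delay)
```

Usage: response = with_retries(lambda: client.chat.completions.create(model="assisters-chat-v1", messages=messages)).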