Assisters API
Chat Models

Conversational AI models for chat completions

Generate conversational responses with Assisters Chat, our flagship conversational AI model. Supports streaming and is fully OpenAI-compatible.

Assisters Chat v1

Model ID (string): assisters-chat-v1

Our advanced conversational AI model with state-of-the-art reasoning capabilities and a 128K context window.

Specification     Value
Model ID          assisters-chat-v1
Context Window    128,000 tokens
Max Output        8,192 tokens
Input Price       $0.10 / million tokens
Output Price      $0.20 / million tokens
Latency           ~200ms to first token

Capabilities

  • Advanced Reasoning: Complex problem-solving and logical analysis
  • Creative Writing: Stories, articles, and creative content
  • Code Generation: Write and explain code in multiple languages
  • Multilingual: Supports 100+ languages
  • Long Context: Process up to 128K tokens in a single request
  • Instruction Following: Precise adherence to detailed instructions

Example Usage

from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
)

print(response.choices[0].message.content)

With Streaming

stream = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "user", "content": "Write a short story about space exploration."}
    ],
    stream=True
)

for chunk in stream:
    # Some chunks (e.g. the final one) may carry no choices or an empty delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Parameters

All chat models support these parameters:

Parameter          Type    Default    Description
messages           array   required   Conversation history
temperature        float   0.7        Randomness (0-2)
max_tokens         int     1024       Maximum output length
top_p              float   1.0        Nucleus sampling
stream             bool    false      Enable streaming
stop               array   null       Stop sequences
presence_penalty   float   0          Penalize repeated topics
frequency_penalty  float   0          Penalize repeated tokens
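Combined in a single request, the parameters above map directly onto the chat completions payload. A sketch of one such request body (the prompt and parameter values are illustrative, not recommendations):

```python
# Illustrative request payload for POST /v1/chat/completions,
# combining the parameters listed above.
payload = {
    "model": "assisters-chat-v1",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet."},
    ],
    "temperature": 0.3,        # low randomness for a factual summary
    "max_tokens": 512,         # cap the output length
    "top_p": 1.0,              # nucleus sampling left at its default
    "stream": False,           # return the full response in one body
    "stop": ["\n\n"],          # halt generation at the first blank line
    "presence_penalty": 0.0,   # defaults shown explicitly
    "frequency_penalty": 0.0,
}
```

The same keys can be passed as keyword arguments to `client.chat.completions.create(**payload)`.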

Best Practices

Use System Messages

Set behavior and context with system messages for consistent results

Enable Streaming

Use stream=True for better UX with longer responses

Manage Context

Trim old messages to stay within the 128K context limit
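One minimal trimming sketch, assuming a rough 4-characters-per-token estimate (a real tokenizer would be more accurate) and keeping system messages pinned:

```python
def trim_messages(messages, max_tokens=128_000, chars_per_token=4):
    """Drop the oldest non-system messages until the estimated
    token count fits within the context window."""
    def estimate(msgs):
        # Crude heuristic: ~4 characters per token
        return sum(len(m["content"]) for m in msgs) // chars_per_token

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest conversational turn first
    return system + rest
```

Call this on the conversation history before each request, e.g. `messages = trim_messages(history)`.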

Adjust Temperature

Lower (0.1-0.3) for factual tasks, higher (0.7-1.0) for creative writing
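As a rule of thumb, the two ends of that range look like this in practice (prompts and values are illustrative; either dict can be unpacked into `client.chat.completions.create(...)`):

```python
# Factual lookup: keep sampling nearly deterministic.
factual = dict(
    model="assisters-chat-v1",
    temperature=0.2,
    messages=[{"role": "user", "content": "When did Apollo 11 land?"}],
)

# Creative writing: allow more varied token choices.
creative = dict(
    model="assisters-chat-v1",
    temperature=0.9,
    messages=[{"role": "user", "content": "Write a haiku about rain."}],
)
```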

Use Cases