Assisters API
Chat Models

Conversational AI models for chat completions

Generate conversational responses with Assisters Chat, our flagship conversational AI model. Supports streaming and is fully OpenAI-compatible.

Assisters Chat v1

Model ID (string): assisters-chat-v1

Our advanced conversational AI model with state-of-the-art reasoning capabilities and a 128K context window.

Specification     Value
Model ID          assisters-chat-v1
Context Window    128,000 tokens
Max Output        8,192 tokens
Input Price       $0.10 / million tokens
Output Price      $0.20 / million tokens
Latency           ~200ms to first token

Capabilities

  • Advanced Reasoning: Complex problem-solving and logical analysis
  • Creative Writing: Stories, articles, and creative content
  • Code Generation: Write and explain code in multiple languages
  • Multilingual: Supports 100+ languages
  • Long Context: Process up to 128K tokens in a single request
  • Instruction Following: Precise adherence to detailed instructions

Example Usage

from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
)

print(response.choices[0].message.content)

With Streaming

stream = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "user", "content": "Write a short story about space exploration."}
    ],
    stream=True
)

for chunk in stream:
    # Some chunks (e.g. the final one) may carry no choices or an empty delta
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Parameters

All chat models support these parameters:

Parameter          Type    Default    Description
messages           array   required   Conversation history
temperature        float   0.7        Randomness (0-2)
max_tokens         int     1024       Maximum output length
top_p              float   1.0        Nucleus sampling
stream             bool    false      Enable streaming
stop               array   null       Stop sequences
presence_penalty   float   0          Penalize repeated topics
frequency_penalty  float   0          Penalize repeated tokens
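Combined in a single request, the parameters above map directly onto the chat completions payload. A sketch of one such request body (the prompt and parameter values are illustrative, not recommendations):

```python
# Illustrative request payload for POST /v1/chat/completions,
# combining the parameters listed above.
payload = {
    "model": "assisters-chat-v1",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the plot of Hamlet."},
    ],
    "temperature": 0.3,        # low randomness for a factual summary
    "max_tokens": 512,         # cap the output length
    "top_p": 1.0,              # nucleus sampling left at its default
    "stream": False,           # return the full response in one body
    "stop": ["\n\n"],          # halt generation at the first blank line
    "presence_penalty": 0.0,   # defaults shown explicitly
    "frequency_penalty": 0.0,
}
```

The same keys can be passed as keyword arguments to `client.chat.completions.create(**payload)`.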

Best Practices

Use System Messages

Set behavior and context with system messages for consistent results

Enable Streaming

Use stream=True for better UX with longer responses

Manage Context

Trim old messages to stay within the 128K context limit
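One minimal trimming sketch, assuming a rough 4-characters-per-token estimate (a real tokenizer would be more accurate) and keeping system messages pinned:

```python
def trim_messages(messages, max_tokens=128_000, chars_per_token=4):
    """Drop the oldest non-system messages until the estimated
    token count fits within the context window."""
    def estimate(msgs):
        # Crude heuristic: ~4 characters per token
        return sum(len(m["content"]) for m in msgs) // chars_per_token

    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and estimate(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest conversational turn first
    return system + rest
```

Call this on the conversation history before each request, e.g. `messages = trim_messages(history)`.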

Adjust Temperature

Lower (0.1-0.3) for factual tasks, higher (0.7-1.0) for creative writing
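As a rule of thumb, the two ends of that range look like this in practice (prompts and values are illustrative; either dict can be unpacked into `client.chat.completions.create(...)`):

```python
# Factual lookup: keep sampling nearly deterministic.
factual = dict(
    model="assisters-chat-v1",
    temperature=0.2,
    messages=[{"role": "user", "content": "When did Apollo 11 land?"}],
)

# Creative writing: allow more varied token choices.
creative = dict(
    model="assisters-chat-v1",
    temperature=0.9,
    messages=[{"role": "user", "content": "Write a haiku about rain."}],
)
```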

Use Cases