Models
Chat Models
Conversational AI models for chat completions
Generate conversational responses with Assisters Chat, our flagship model. It supports streaming and is fully OpenAI-compatible.
Assisters Chat v1
Our advanced conversational AI model with state-of-the-art reasoning capabilities and a 128K context window.
| Specification | Value |
|---|---|
| Model ID | assisters-chat-v1 |
| Context Window | 128,000 tokens |
| Max Output | 8,192 tokens |
| Input Price | $0.10 / million tokens |
| Output Price | $0.20 / million tokens |
| Latency | ~200ms first token |
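The pricing rows above translate directly into a per-request cost estimate. The sketch below is illustrative arithmetic only, using the listed rates of $0.10 per million input tokens and $0.20 per million output tokens:

```python
# Estimate request cost from the pricing table above.
# Rates are per million tokens: $0.10 input, $0.20 output.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.20

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in dollars."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token reply.
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.6f}")  # $0.001400
```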
Capabilities
- Advanced Reasoning: Complex problem-solving and logical analysis
- Creative Writing: Stories, articles, and creative content
- Code Generation: Write and explain code in multiple languages
- Multilingual: Supports 100+ languages
- Long Context: Process up to 128K tokens in a single request
- Instruction Following: Precise adherence to detailed instructions
Example Usage
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ]
)

print(response.choices[0].message.content)
```
With Streaming
```python
stream = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "user", "content": "Write a short story about space exploration."}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Parameters
All chat models support these parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| messages | array | required | Conversation history |
| temperature | float | 0.7 | Randomness (0-2) |
| max_tokens | int | 1024 | Maximum output length |
| top_p | float | 1.0 | Nucleus sampling |
| stream | bool | false | Enable streaming |
| stop | array | null | Stop sequences |
| presence_penalty | float | 0 | Penalize repeated topics |
| frequency_penalty | float | 0 | Penalize repeated tokens |
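To show how these parameters combine, here is a sketch of a request payload. The parameter values are illustrative choices, not recommendations; with a configured client (see Example Usage above), the payload would be passed as keyword arguments to `client.chat.completions.create`:

```python
# Illustrative request payload using the parameters above.
params = {
    "model": "assisters-chat-v1",
    "messages": [
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize the CAP theorem in two sentences."},
    ],
    "temperature": 0.3,       # low temperature for a factual task
    "max_tokens": 256,        # cap the output length
    "top_p": 1.0,
    "stop": ["\n\n"],         # stop at the first blank line
    "frequency_penalty": 0.2, # discourage repeated tokens
}

# With a configured client, the request would be:
# response = client.chat.completions.create(**params)
print(sorted(params.keys()))
```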
Best Practices
- Use System Messages: Set behavior and context with a system message for consistent results
- Enable Streaming: Use stream=True for better UX on longer responses
- Manage Context: Trim old messages to stay within the 128K context window
- Adjust Temperature: Use lower values (0.1-0.3) for factual tasks and higher values (0.7-1.0) for creative writing
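The context-management advice above can be sketched as a simple trimming loop: keep the system message, and drop the oldest user/assistant turns until the history fits a token budget. The 4-characters-per-token estimate is a rough heuristic of ours, not part of the API; a real tokenizer gives exact counts.

```python
# Sketch of context trimming: keep the system message, drop the oldest
# turns until the history fits the budget. The chars//4 token estimate
# is a rough heuristic; a real tokenizer would be more accurate.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_history(messages, max_tokens=128_000):
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while turns and total(system + turns) > max_tokens:
        turns.pop(0)  # drop the oldest turn first
    return system + turns

history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "old question " * 50},
    {"role": "assistant", "content": "old answer " * 50},
    {"role": "user", "content": "latest question"},
]
trimmed = trim_history(history, max_tokens=100)
print([m["role"] for m in trimmed])
```

The system message is preserved unconditionally because it sets model behavior; only conversational turns are discarded, oldest first.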