Audio

Convert text to natural-sounding speech, and transcribe audio to text, with the audio endpoints.

The audio API is OpenAI-compatible: the text-to-speech endpoint returns audio bytes directly, and the transcription endpoint returns the recognized text.

Text-to-Speech

Endpoint

POST https://api.assisters.dev/v1/audio/speech

Request Body

model (string, required)

The TTS model to use. Example: assisters-tts-v1

input (string, required)

The text to synthesize into speech. Maximum 4096 characters.

voice (string, default: alloy)

The voice to use. Options: alloy, echo, fable, onyx, nova, shimmer

response_format (string, default: mp3)

The output audio format. Options: mp3, opus, aac, flac, wav, pcm

speed (number, default: 1.0)

Playback speed multiplier. Range: 0.25 to 4.0.
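
The limits listed above can be checked on the client before a request is sent. The following is a minimal sketch rather than part of any Assisters SDK: build_speech_request is a hypothetical helper, and the field names response_format and speed are assumptions based on the OpenAI-compatible request schema.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_request(model, text, voice="alloy", response_format="mp3", speed=1.0):
    """Return a JSON-serializable body for POST /v1/audio/speech.

    Hypothetical helper: raises ValueError when a documented limit is violated.
    """
    if not model:
        raise ValueError("model is required")
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    # Field names response_format and speed are assumed (OpenAI-compatible schema).
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

Failing early on the client gives a clearer error than a rejected request and avoids a network round trip.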

Response

Returns the raw audio bytes. The Content-Type header is audio/mpeg for the default mp3 format and otherwise matches the requested format.

Example

curl https://api.assisters.dev/v1/audio/speech \
  -H "Authorization: Bearer $ASSISTERS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-tts-v1",
    "input": "Hello! Welcome to Assisters.",
    "voice": "nova"
  }' \
  --output speech.mp3

// Node.js (ES module): the same request through the OpenAI SDK
import fs from "node:fs/promises";
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.assisters.dev/v1",
  apiKey: process.env.ASSISTERS_API_KEY,
});

const response = await client.audio.speech.create({
  model: "assisters-tts-v1",
  input: "Hello! Welcome to Assisters.",
  voice: "nova",
});

const buffer = Buffer.from(await response.arrayBuffer());
await fs.writeFile("speech.mp3", buffer);

# Python: the same request through the OpenAI SDK
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key=os.environ["ASSISTERS_API_KEY"],
)

response = client.audio.speech.create(
    model="assisters-tts-v1",
    input="Hello! Welcome to Assisters.",
    voice="nova",
)

response.stream_to_file("speech.mp3")
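
Text longer than the 4096-character input limit has to be sent as several requests. One possible approach, sketched below, splits the text at sentence boundaries and packs sentences into chunks that fit the limit. split_for_tts is a hypothetical helper, and the sentence-boundary regex is a simplification that will not handle every abbreviation or language.

```python
import re

MAX_INPUT_CHARS = 4096


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    Prefers sentence boundaries; a single sentence longer than `limit`
    is cut hard at the limit. Whitespace between sentences is normalized
    to a single space.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one request.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed as the input of a separate speech request. How the resulting audio segments are joined depends on the output format and is not covered here.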

Audio Transcription

Endpoint

POST https://api.assisters.dev/v1/audio/transcriptions

Request Body

file (file, required)

The audio file to transcribe. Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm

model (string, required)

The transcription model. Example: assisters-stt-v1

language (string, optional)

ISO-639-1 language code (e.g. en, es, fr). Auto-detected if omitted.

response_format (string, default: json)

Output format. Options: json, text, srt, vtt, verbose_json
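
The supported file formats and output options above can be validated before uploading. The sketch below is illustrative only: build_transcription_fields is a hypothetical helper, the form field names language and response_format are assumptions based on the OpenAI-compatible schema, and the file-extension check is only a rough proxy for the actual audio format.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "vtt", "verbose_json"}


def build_transcription_fields(path, model, language=None, response_format="json"):
    """Return the non-file form fields for POST /v1/audio/transcriptions.

    Hypothetical helper: raises ValueError for unsupported values.
    """
    extension = Path(path).suffix.lower().lstrip(".")
    if extension not in AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension!r}")
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unknown response format: {response_format!r}")
    # ISO-639-1 codes are exactly two ASCII letters, e.g. "en".
    if language is not None:
        if not (len(language) == 2 and language.isascii() and language.isalpha()):
            raise ValueError(f"not an ISO-639-1 code: {language!r}")
    fields = {"model": model, "response_format": response_format}
    # Omit language entirely so the service can auto-detect it.
    if language is not None:
        fields["language"] = language.lower()
    return fields
```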

Response

With the default json output format, the response is a JSON object with a text field:

{
  "text": "Hello, welcome to Assisters."
}
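
Client code usually needs only the transcript string. The sketch below shows one way to extract it from a raw response body. extract_transcript is a hypothetical helper. It assumes that verbose_json also carries a top-level text field and that the text, srt, and vtt formats come back as plain-text bodies; neither is confirmed on this page.

```python
import json


def extract_transcript(body, response_format="json"):
    """Return the transcript from a raw response body given as a str."""
    if response_format in ("json", "verbose_json"):
        # Assumed: both JSON formats expose the transcript under "text".
        return json.loads(body)["text"]
    # Assumed: text, srt, and vtt are returned as plain text, passed through unchanged.
    return body
```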

Example

curl https://api.assisters.dev/v1/audio/transcriptions \
  -H "Authorization: Bearer $ASSISTERS_API_KEY" \
  -F "file=@audio.mp3" \
  -F "model=assisters-stt-v1"
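
The same request can probably be made through the OpenAI Python SDK, as in the text-to-speech example. This page shows only curl for transcription, so treat the following as a sketch under that assumption. transcribe is a hypothetical wrapper and the file path is supplied by the caller.

```python
import os


def transcribe(path, model="assisters-stt-v1", language=None):
    """Transcribe a local audio file and return the transcript text."""
    # Imported lazily so this module loads even where the SDK is not installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.assisters.dev/v1",
        api_key=os.environ["ASSISTERS_API_KEY"],
    )
    # Pass language only when set, so auto-detection remains the default.
    extra = {"language": language} if language else {}
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            **extra,
        )
    # With the default json format the SDK returns an object with a text attribute.
    return result.text
```

For example, transcribe("audio.mp3", language="en") would upload the file and return the recognized text.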