Content Moderation

Automatically detect and filter harmful, inappropriate, or policy-violating content. Use this endpoint to protect your users and maintain community standards.

Endpoint

POST https://api.assisters.dev/v1/moderate
POST https://api.assisters.dev/v1/moderations

Both paths are equivalent. /v1/moderations is the OpenAI-compatible alias.

Request Body

model (string, default: assisters-moderation-v1)

The moderation model to use. See available models.

Example: assisters-moderation-v1

input (string | array, required)

The text to moderate. Can be a single string or an array of up to 100 strings.
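Because a single request accepts at most 100 strings, a longer list has to be split before sending. The sketch below shows one way to do that in plain Python; `chunk_inputs` and `MAX_INPUTS_PER_REQUEST` are hypothetical names introduced for illustration, not part of the API or SDK.

```python
from typing import Iterator, List, Sequence

# Limit taken from the "input" field description above.
MAX_INPUTS_PER_REQUEST = 100


def chunk_inputs(
    texts: Sequence[str], size: int = MAX_INPUTS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


batches = list(chunk_inputs([f"message {n}" for n in range(250)]))
print([len(batch) for batch in batches])  # prints [100, 100, 50]
```

Each yielded batch can then be passed as the `input` of one moderation request.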

Request Examples

Single Text

Python

from openai import OpenAI

client = OpenAI(
    api_key="ask_your_api_key",
    base_url="https://api.assisters.dev/v1"
)

response = client.moderations.create(
    model="assisters-moderation-v1",
    input="Hello, how are you today?"
)

result = response.results[0]
print(f"Flagged: {result.flagged}")
print(f"Categories: {result.categories}")

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'ask_your_api_key',
  baseURL: 'https://api.assisters.dev/v1'
});

const response = await client.moderations.create({
  model: 'assisters-moderation-v1',
  input: 'Hello, how are you today?'
});

const result = response.results[0];
console.log(`Flagged: ${result.flagged}`);
console.log(`Categories:`, result.categories);

cURL

curl https://api.assisters.dev/v1/moderate \
  -H "Authorization: Bearer ask_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-moderation-v1",
    "input": "Hello, how are you today?"
  }'

Batch Moderation

# Reuses the `client` created in the Single Text example.
response = client.moderations.create(
    model="assisters-moderation-v1",
    input=[
        "First message to check",
        "Second message to check",
        "Third message to check"
    ]
)

for i, result in enumerate(response.results):
    print(f"Message {i}: Flagged={result.flagged}")
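To act on a batch, each result usually has to be matched back to the text it describes. The sketch below works on the raw JSON body and assumes results come back in the same order as the inputs; the page only states that there is one result per input, so treat the ordering as an assumption to verify. `flagged_inputs` is a hypothetical helper, and the sample body is hand-written for illustration.

```python
from typing import Any, Dict, List, Sequence


def flagged_inputs(texts: Sequence[str], response_body: Dict[str, Any]) -> List[str]:
    """Return the input texts whose corresponding result has flagged=True."""
    results = response_body["results"]
    if len(results) != len(texts):
        raise ValueError("expected one result per input")
    # Assumption: results[i] corresponds to texts[i].
    return [text for text, result in zip(texts, results) if result["flagged"]]


# Hand-written response body for illustration only; not real API output.
texts = [
    "First message to check",
    "Second message to check",
    "Third message to check",
]
sample_body = {"results": [{"flagged": False}, {"flagged": True}, {"flagged": False}]}
print(flagged_inputs(texts, sample_body))  # prints ['Second message to check']
```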

Pre-moderation Pattern

# Reuses the `client` created in the Single Text example.
def moderate_before_response(user_message):
    """Check user input before passing it to the chat model."""
    moderation = client.moderations.create(
        model="assisters-moderation-v1",
        input=user_message
    )

    if moderation.results[0].flagged:
        return {
            "error": "Your message violates our content policy",
            "categories": moderation.results[0].categories
        }

    # Process the message normally
    response = client.chat.completions.create(
        model="assisters-chat-v1",
        messages=[{"role": "user", "content": user_message}]
    )

    return {"response": response.choices[0].message.content}

Response

{
  "id": "modr-abc123xyz",
  "model": "assisters-moderation-v1",
  "results": [
    {
      "flagged": false,
      "categories": {
        "hate": false,
        "hate/threatening": false,
        "harassment": false,
        "harassment/threatening": false,
        "self-harm": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "hate": 0.00012,
        "hate/threatening": 0.00001,
        "harassment": 0.00034,
        "harassment/threatening": 0.00002,
        "self-harm": 0.00001,
        "self-harm/intent": 0.00001,
        "self-harm/instructions": 0.00001,
        "sexual": 0.00015,
        "sexual/minors": 0.00001,
        "violence": 0.00023,
        "violence/graphic": 0.00002
      }
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  },
  "cost": {
    "total_usd": 0.0000016,
    "price_per_million_tokens": 0.2
  }
}

Response Fields

id (string)

Unique identifier for the moderation request

model (string)

The model used for moderation

results (array)

Array of moderation results, one per input:

flagged: Boolean indicating if content violates policy
categories: Object with boolean for each category
category_scores: Object with confidence scores (0-1) for each category
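Since `category_scores` are values between 0 and 1, an application can apply its own cut-offs instead of relying only on `flagged`. The following is a minimal sketch of that idea on the raw JSON values; `categories_over_threshold`, the default of 0.5, and every threshold and score in the example are illustrative assumptions, not documented values.

```python
from typing import Dict, List


def categories_over_threshold(
    category_scores: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,  # assumed fallback, not an API default
) -> List[str]:
    """Return sorted category names whose score meets or exceeds its threshold."""
    return sorted(
        name
        for name, score in category_scores.items()
        if score >= thresholds.get(name, default_threshold)
    )


# Illustrative scores and thresholds only; not real API output.
scores = {"hate": 0.00012, "harassment": 0.41, "violence": 0.07}
strict = {"harassment": 0.3}
print(categories_over_threshold(scores, strict))  # prints ['harassment']
```

A lower threshold for a category makes the check stricter for that category; categories without an entry fall back to `default_threshold`.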

usage (object)

Token usage for billing

cost (object)

Cost breakdown: total_usd and price_per_million_tokens
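The figures in the sample response above are consistent with `total_usd = total_tokens × price_per_million_tokens / 1,000,000`. That relationship is inferred from the sample, not a documented formula, and `estimate_cost_usd` is a hypothetical helper that only reproduces the arithmetic.

```python
import math


def estimate_cost_usd(total_tokens: int, price_per_million_tokens: float) -> float:
    """Estimate request cost in USD from a token count and a per-million price."""
    return total_tokens * price_per_million_tokens / 1_000_000


# Values copied from the sample response: 8 tokens at 0.2 per million tokens.
estimated = estimate_cost_usd(8, 0.2)
print(math.isclose(estimated, 0.0000016))  # prints True
```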

Categories

Category	Description
`hate`	Content expressing hatred toward a group
`hate/threatening`	Hateful content with threats of violence
`harassment`	Content meant to harass or bully
`harassment/threatening`	Harassment with threats
`self-harm`	Content promoting self-harm
`self-harm/intent`	Expression of self-harm intent
`self-harm/instructions`	Instructions for self-harm
`sexual`	Sexually explicit content
`sexual/minors`	Sexual content involving minors
`violence`	Content depicting violence
`violence/graphic`	Graphic depictions of violence
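Category names follow a `parent/subtype` pattern, so flagged entries can be rolled up to their top-level category for reporting. The sketch below does this on the raw `categories` object from the JSON body; `flagged_families` is a hypothetical helper and the sample values are illustrative.

```python
from typing import Dict, List


def flagged_families(categories: Dict[str, bool]) -> List[str]:
    """Return sorted top-level category names with at least one true entry."""
    return sorted(
        {name.split("/", 1)[0] for name, hit in categories.items() if hit}
    )


# Illustrative values only; not real API output.
sample = {
    "hate": False,
    "harassment": True,
    "harassment/threatening": True,
    "violence/graphic": True,
    "sexual": False,
}
print(flagged_families(sample))  # prints ['harassment', 'violence']
```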

Available Models

Model	Description	Price
`assisters-moderation-v1`	Advanced safety model with 14 categories	$0.05/M tokens
