Moderation
Detect harmful, inappropriate, or policy-violating content
Content Moderation
Automatically detect and filter harmful, inappropriate, or policy-violating content. Use this endpoint to protect your users and maintain community standards.
Endpoint
POST https://api.assisters.dev/v1/moderate
POST https://api.assisters.dev/v1/moderations
/v1/moderations is the OpenAI-compatible alias.
Request Body
model (string, default: assisters-moderation-v1): The moderation model to use. See Available Models.
Example: assisters-moderation-v1
input (string | array, required): The text to moderate. Can be a single string or an array of up to 100 strings.
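
Because an array input is limited to 100 strings, longer lists have to be split across several requests. The sketch below shows one way to do that; `chunk_inputs` and `MAX_BATCH_SIZE` are illustrative names, not part of the API or the SDK.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 100  # per-request limit stated above for array input


def chunk_inputs(texts: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


# Each batch can then be passed as `input` to one moderation request.
messages = [f"message {i}" for i in range(250)]
batch_sizes = [len(batch) for batch in chunk_inputs(messages)]
print(batch_sizes)  # [100, 100, 50]
```

Results come back one per input, so keeping the batches in order lets you map each result to its original message.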
Request Examples
Single Text
Python

from openai import OpenAI

# Point the OpenAI SDK at the Assisters API
client = OpenAI(
    api_key="ask_your_api_key",
    base_url="https://api.assisters.dev/v1"
)

response = client.moderations.create(
    model="assisters-moderation-v1",
    input="Hello, how are you today?"
)

# One result is returned per input
result = response.results[0]
print(f"Flagged: {result.flagged}")
print(f"Categories: {result.categories}")

JavaScript

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'ask_your_api_key',
  baseURL: 'https://api.assisters.dev/v1'
});

const response = await client.moderations.create({
  model: 'assisters-moderation-v1',
  input: 'Hello, how are you today?'
});

const result = response.results[0];
console.log(`Flagged: ${result.flagged}`);
console.log(`Categories:`, result.categories);

cURL

curl https://api.assisters.dev/v1/moderate \
  -H "Authorization: Bearer ask_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assisters-moderation-v1",
    "input": "Hello, how are you today?"
  }'

Batch Moderation
# Reuses the `client` created in the single-text example
response = client.moderations.create(
    model="assisters-moderation-v1",
    input=[
        "First message to check",
        "Second message to check",
        "Third message to check"
    ]
)

# Results are returned in the same order as the inputs
for i, result in enumerate(response.results):
    print(f"Message {i}: Flagged={result.flagged}")

Pre-moderation Pattern
def moderate_before_response(user_message):
    """Check user input before processing"""
    moderation = client.moderations.create(
        model="assisters-moderation-v1",
        input=user_message
    )

    # Stop early if the message is flagged
    if moderation.results[0].flagged:
        return {
            "error": "Your message violates our content policy",
            "categories": moderation.results[0].categories
        }

    # Process the message normally
    response = client.chat.completions.create(
        model="assisters-chat-v1",
        messages=[{"role": "user", "content": user_message}]
    )
    return {"response": response.choices[0].message.content}

Response
{
  "id": "modr-abc123xyz",
  "model": "assisters-moderation-v1",
  "results": [
    {
      "flagged": false,
      "categories": {
        "hate": false,
        "hate/threatening": false,
        "harassment": false,
        "harassment/threatening": false,
        "self-harm": false,
        "self-harm/intent": false,
        "self-harm/instructions": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "hate": 0.00012,
        "hate/threatening": 0.00001,
        "harassment": 0.00034,
        "harassment/threatening": 0.00002,
        "self-harm": 0.00001,
        "self-harm/intent": 0.00001,
        "self-harm/instructions": 0.00001,
        "sexual": 0.00015,
        "sexual/minors": 0.00001,
        "violence": 0.00023,
        "violence/graphic": 0.00002
      }
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  },
  "cost": {
    "total_usd": 0.0000016,
    "price_per_million_tokens": 0.2
  }
}

Response Fields
id (string): Unique identifier for the moderation request
model (string): The model used for moderation
results (array): Array of moderation results, one per input:
  flagged: Boolean indicating whether the content violates policy
  categories: Object with a boolean for each category
  category_scores: Object with a confidence score (0-1) for each category
usage (object): Token usage for billing
cost (object): Cost breakdown: total_usd and price_per_million_tokens
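
As a rough illustration of reading these fields, the sketch below walks a response body that has already been parsed into a Python dict. `summarize_result` and the abbreviated `sample_response` are hypothetical and exist only for this example; a real response contains every category.

```python
from typing import Any, Dict, List, Tuple


def summarize_result(result: Dict[str, Any]) -> Tuple[bool, List[str], Tuple[str, float]]:
    """Return (flagged, names of categories set to true, highest-scoring category)."""
    flagged_categories = sorted(
        name for name, hit in result["categories"].items() if hit
    )
    top_category = max(result["category_scores"].items(), key=lambda item: item[1])
    return result["flagged"], flagged_categories, top_category


# Hypothetical, abbreviated response body used only for illustration.
sample_response = {
    "results": [
        {
            "flagged": True,
            "categories": {"harassment": True, "violence": False},
            "category_scores": {"harassment": 0.91, "violence": 0.12},
        }
    ]
}

for index, result in enumerate(sample_response["results"]):
    flagged, names, (top_name, top_score) = summarize_result(result)
    print(index, flagged, names, top_name, top_score)
```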
Categories
| Category | Description |
|---|---|
| hate | Content expressing hatred toward a group |
| hate/threatening | Hateful content with threats of violence |
| harassment | Content meant to harass or bully |
| harassment/threatening | Harassment with threats |
| self-harm | Content promoting self-harm |
| self-harm/intent | Expression of self-harm intent |
| self-harm/instructions | Instructions for self-harm |
| sexual | Sexually explicit content |
| sexual/minors | Sexual content involving minors |
| violence | Content depicting violence |
| violence/graphic | Graphic depictions of violence |
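
Several categories in the table share a prefix (for example hate and hate/threatening). If you only need one score per top-level category, a sketch like the following collapses them. Treating the text before the "/" as a parent category is an assumption based on the names in the table, not a documented API feature, and `roll_up_scores` is an illustrative helper.

```python
from typing import Dict


def roll_up_scores(category_scores: Dict[str, float]) -> Dict[str, float]:
    """Collapse 'parent/child' categories into their parent, keeping the highest score."""
    rolled: Dict[str, float] = {}
    for name, score in category_scores.items():
        parent = name.split("/", 1)[0]  # "hate/threatening" -> "hate"
        rolled[parent] = max(score, rolled.get(parent, 0.0))
    return rolled


# Made-up scores for illustration only.
scores = {
    "hate": 0.02,
    "hate/threatening": 0.30,
    "self-harm": 0.01,
    "self-harm/intent": 0.005,
    "violence": 0.40,
    "violence/graphic": 0.10,
}
print(roll_up_scores(scores))
```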
Available Models
| Model | Description | Price |
|---|---|---|
| assisters-moderation-v1 | Advanced safety model with 14 categories | $0.05/M tokens |
Model Details
See detailed model specifications
Use Cases
Best Practices
Moderate Both Inputs and Outputs
Check user messages AND AI responses for safety
Use Custom Thresholds
Adjust category_scores thresholds based on your use case
Batch for Efficiency
Send multiple texts in one request when possible
Cache Results
Cache moderation results for repeated content
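
The threshold and caching suggestions above can be combined in a small wrapper. This is a minimal sketch under stated assumptions: the threshold values are invented, and `score_fn` is a stand-in for whatever function calls the moderation endpoint and returns `category_scores` as a dict. None of these names come from the API.

```python
import hashlib
from typing import Callable, Dict, List

Scores = Dict[str, float]

# Hypothetical thresholds; tune them for your own use case.
THRESHOLDS: Scores = {"sexual/minors": 0.01, "violence": 0.50}
DEFAULT_THRESHOLD = 0.80  # assumed fallback for categories not listed above


def over_threshold(scores: Scores) -> List[str]:
    """Return the categories whose score meets or exceeds its threshold."""
    return sorted(
        name for name, score in scores.items()
        if score >= THRESHOLDS.get(name, DEFAULT_THRESHOLD)
    )


class CachedModerator:
    """Cache scores by a hash of the text so repeated content is scored once."""

    def __init__(self, score_fn: Callable[[str], Scores]) -> None:
        self._score_fn = score_fn  # stand-in for a real moderation call
        self._cache: Dict[str, Scores] = {}

    def scores(self, text: str) -> Scores:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._score_fn(text)
        return self._cache[key]


calls: List[str] = []


def fake_score_fn(text: str) -> Scores:
    """Hypothetical scorer that records each call instead of hitting the API."""
    calls.append(text)
    return {"violence": 0.62, "hate": 0.10}


moderator = CachedModerator(fake_score_fn)
first = over_threshold(moderator.scores("some repeated text"))
second = over_threshold(moderator.scores("some repeated text"))
print(first, second, len(calls))  # the scorer runs once for the repeated text
```

An in-memory dict is the simplest choice here; a production cache would likely need size limits and expiry, which this sketch leaves out.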