Kuro AI is Whistle’s built-in content moderation engine. It analyzes submitted text against your project’s configured report categories and provides risk scoring, severity assessment, and automated report creation.
Features
- Content flag detection — Score text before publish or on a schedule
- Report enrichment — Classify and prioritize ingested reports
- External link scanning — Follow URLs found in text when enabled
- Configurable thresholds — Tune when auto-reports are created
How it works
Kuro AI classifies text against your report categories:
- Your server calls the analysis endpoint with the content snippet.
- The model returns severity, confidence, reasoning, and category hits.
- If confidence meets your threshold (and the content is not treated as safe), Whistle can open a report automatically.
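The gate in the last step can be sketched as follows. This is a minimal illustration, not Whistle's internal code, and treating "no categories matched" as safe is an assumption:

```js
// Sketch of the auto-report gate: open a report only when at least one
// category matched AND confidence clears the configured threshold.
// Treating "no categories matched" as safe is an assumption for illustration.
function shouldAutoReport(analysis, thresholdPercent) {
  const treatedAsSafe = analysis.categories.length === 0;
  return !treatedAsSafe && analysis.confidence >= thresholdPercent;
}

shouldAutoReport({ categories: ["Harassment"], confidence: 87 }, 70); // → true
shouldAutoReport({ categories: [], confidence: 95 }, 70); // → false (safe)
```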
Severity levels
| Level | Description |
|---|---|
| LOW | No policy violation likely. Normal content. |
| MEDIUM | Potentially problematic. Borderline content requiring context. |
| HIGH | Likely policy violation. Clearly violates common trust & safety policies. |
| CRITICAL | Severe violation. Illegal content, immediate threats, or urgent action required. |
Confidence score
0–100: model certainty for the returned severity/categories. Tune your threshold against the false-positive rate you observe in your environment.
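One way to measure that rate: periodically sample auto-created reports, have a human mark each as a real violation or not, and compute the share that were wrong. The helper and field names below are hypothetical:

```js
// Hypothetical helper: given a hand-reviewed sample of auto-created reports,
// compute the share that a human judged NOT to be real violations.
function falsePositiveRate(sampledReports) {
  if (sampledReports.length === 0) return 0;
  const falsePositives = sampledReports.filter((r) => !r.violation).length;
  return falsePositives / sampledReports.length;
}

// Example: 2 of 5 sampled auto-reports were judged not to be violations
falsePositiveRate([
  { violation: true },
  { violation: false },
  { violation: true },
  { violation: false },
  { violation: true },
]); // → 0.4
```

If the rate climbs, raise the confidence threshold; if real violations slip through unreported, lower it.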
Configuration
In Settings → Kuro AI:
- Enable AI analysis — Turn the engine on for the project.
- Confidence threshold — Minimum confidence (1–100%) to auto-create a report.
- Scan external links — Fetch and analyze URLs discovered in text.
- Report categories — Maintain labels and optional AI descriptions under Settings → Moderation → Report categories.
Example category helper text:
Name: Harassment
AI Description: Content that targets individuals with abusive language, threats, or repeated unwanted contact
Content flag detection
Analyze a text snippet on demand (for example before publish).
Endpoint
POST /api/v1/organizations/{orgSlug}/projects/{projectId}/content
Request
| Field | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | Content text to analyze. Max 10,000 characters. |
| externalId | string | No | Your unique content identifier (for example a post ID). |
| contentType | string | No | Content type hint (for example post, comment, message). |
```json
{
  "content": "This post contains harmful targeted harassment against another user.",
  "externalId": "post_abc123",
  "contentType": "post"
}
```
Response
| Field | Type | Description |
|---|---|---|
| analysis.categories | string[] | Matched report category names (may be empty). |
| analysis.severity | string | One of LOW, MEDIUM, HIGH, CRITICAL. |
| analysis.confidence | number | Confidence score 0–100. |
| analysis.reasoning | string | Brief explanation of the classification. |
| reportCreated | boolean | Whether a report was auto-created. |
| reportId | string \| null | Report ID if created, otherwise null. |
```json
{
  "analysis": {
    "categories": ["Harassment"],
    "severity": "HIGH",
    "confidence": 87,
    "reasoning": "This content contains targeted harassment with abusive language directed at another user."
  },
  "reportCreated": true,
  "reportId": "clx..."
}
```
Auto-report creation
When confidence meets your threshold, Whistle can create a report with:
- Reason aligned to the detected category
- Description summarizing severity, confidence, reasoning, and the source text
- Target metadata from externalId / contentType when you supply them
- automated: true, marking the report as AI-sourced
Errors
| Status | Error |
|---|---|
| 400 | Content is required |
| 400 | Content too long. Maximum 10000 characters. |
| 400 | Content analysis is not enabled for this project |
| 400 | No report categories configured for this project |
| 404 | Not found — invalid orgSlug or projectId |
| 401 | Unauthorized — invalid or missing API key |
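The two 400 cases a client can detect locally are worth mirroring before you call the API. A sketch, with the helper name hypothetical and the messages copied from the table above:

```js
// Hypothetical pre-flight check mirroring the two 400 errors a client can
// detect locally; the server remains the source of truth for the rest
// (analysis disabled, no categories configured).
function validateContentRequest(content) {
  if (typeof content !== "string" || content.length === 0) {
    return "Content is required";
  }
  if (content.length > 10000) {
    return "Content too long. Maximum 10000 characters.";
  }
  return null; // passes local checks
}
```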
Report enrichment
When enabled, Kuro AI can enrich ingested reports asynchronously (severity, categories, reasoning). Depending on confidence, the system may move a report from OPEN into IN_REVIEW after enrichment completes—see the moderation pipeline in Moderation workflow.
What gets enriched
- Report description text
- Original submitted content (for API-submitted reports)
- Reasoning and confidence scores
- Matched categories from your configuration
Audit trail
AI enrichment and analysis emit audit events (for example AI_ANALYZED) with severity, confidence, categories, and reasoning for compliance review.
External link scanning
With Scan external links enabled, URLs in the submitted content string are fetched and scored.
How it works
- Content is submitted to the flag detection endpoint
- Kuro AI extracts all HTTP/HTTPS URLs from the text
- Each URL is fetched and analyzed (asynchronously, in the background)
- Reports are auto-created if the confidence threshold is met
Scan behavior
- URLs are fetched with a 10-second timeout
- Maximum response size: 100 KB
- HTML content is stripped (scripts, styles, tags removed)
- Content is cached for 24 hours per URL to avoid duplicate scans
- Multiple URLs in the same content are scanned in parallel
Example
```json
{
  "content": "Check out this post: https://example.com/spam-post and this one: https://example.com/harassment",
  "externalId": "comment_123",
  "contentType": "comment"
}
```
Both URLs will be automatically extracted and analyzed in the background.
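The extract-and-deduplicate step can be approximated like this. This is an illustration only; the scanner's actual URL pattern is internal:

```js
// Illustrative sketch: pull every HTTP/HTTPS URL out of a content string,
// deduplicating so each URL is scanned (and cached) once.
function extractUrls(content) {
  const matches = content.match(/https?:\/\/[^\s<>"')\]]+/g) ?? [];
  return [...new Set(matches)];
}

extractUrls(
  "Check out https://example.com/spam-post and https://example.com/harassment",
);
// → ["https://example.com/spam-post", "https://example.com/harassment"]
```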
Analysis history
List recent analyses for a project:
GET /api/v1/organizations/{orgSlug}/projects/{projectId}/content-analyses
Returns the 50 most recent rows with categories, severity, confidence, reasoning, and linked report IDs when present.
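A typical consumer filters those rows for follow-up. A sketch, assuming the endpoint returns a JSON array with the documented fields (helper name hypothetical):

```js
// Hypothetical triage helper over rows from GET .../content-analyses,
// assuming each row carries the documented severity field.
function highRiskAnalyses(rows) {
  return rows.filter((a) => a.severity === "HIGH" || a.severity === "CRITICAL");
}

const rows = [
  { severity: "LOW", confidence: 40, categories: [] },
  { severity: "HIGH", confidence: 87, categories: ["Harassment"], reportId: "clx..." },
];
const flagged = highRiskAnalyses(rows); // → only the HIGH row
```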
Integration patterns
Pre-publish moderation
Call the content endpoint from your backend before you persist or syndicate UGC:
```js
// Hypothetical wrapper; orgSlug, projectId, and apiKey come from your config.
async function moderateBeforePublish(postId, postContent) {
  const base = "https://api.kansato.com"; // or your self-hosted origin
  const response = await fetch(
    `${base}/api/v1/organizations/${orgSlug}/projects/${projectId}/content`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        content: postContent,
        externalId: postId,
        contentType: "post",
      }),
    },
  );
  const result = await response.json();

  // Block on critical severity or very high confidence; otherwise publish
  // and surface the reasoning as a soft warning.
  if (result.analysis.severity === "CRITICAL" || result.analysis.confidence > 90) {
    return { allowed: false, reason: result.analysis.reasoning };
  }
  return { allowed: true, warning: result.analysis.reasoning };
}
```
Post-publish monitoring
The same endpoint works after content is live if you want asynchronous risk scoring and optional auto-report creation.
Text with URLs
Enable Scan External Links so URLs embedded in content are fetched and analyzed in the background (see External Link Scanning above).
Best practices
- Start near the default threshold (for example 70%) and move it based on false positives.
- Invest in short, concrete category descriptions; they steer the model more than long policy prose.
- Sample-review automated reports weekly; automation drifts as slang and tactics change.
- Keep humans in the loop for enforcement; AI should triage, not silently ban.
- Watch low-confidence buckets; they usually mean ambiguous policy or missing categories.
Limitations
- Text-first — Images/video are out of scope unless you transcribe or describe them in text.
- Request size — Up to 10,000 characters accepted per call; the model analyzes a shorter inner window (see server limits).
- Language — Best results in English today; expect variance elsewhere.
- Context — Community norms and sarcasm still need human judgment.