Kuro AI is Whistle’s built-in content moderation engine. It analyzes submitted text against your project’s configured report categories and provides risk scoring, severity assessment, and automated report creation.
Kuro AI is available on the Pro and Enterprise plans. See kansato.com/pricing for details.

Features

  • Content flag detection — Score text before publish or on a schedule
  • Report enrichment — Classify and prioritize ingested reports
  • External link scanning — Follow URLs found in text when enabled
  • Configurable thresholds — Tune when auto-reports are created

How it works

Kuro AI classifies text against your report categories:
  1. Your server calls the analysis endpoint with the content snippet.
  2. The model returns severity, confidence, reasoning, and category hits.
  3. If confidence meets your threshold (and the content is not treated as safe), Whistle can open a report automatically.
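
A rough sketch of the decision in step 3, in TypeScript (the exact server-side rule is not published; treating LOW as safe and the threshold comparison are our assumptions):
// Illustrative only: approximate the auto-report decision from step 3.
// Assumes LOW severity counts as safe; the real rule may differ.
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

function wouldAutoReport(severity: Severity, confidence: number, threshold: number): boolean {
  return severity !== "LOW" && confidence >= threshold;
}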

Severity levels

Level      Description
LOW        No policy violation likely. Normal content.
MEDIUM     Potentially problematic. Borderline content requiring context.
HIGH       Likely policy violation. Clearly violates common trust & safety policies.
CRITICAL   Severe violation. Illegal content, immediate threats, or urgent action required.

Confidence score

Confidence ranges from 0 to 100 and expresses model certainty for the returned severity and categories. Tune your threshold against the false-positive rate you observe in your own environment.
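
One way to tune is to hand-label a sample of past analyses and measure the false-positive rate at candidate thresholds; a minimal sketch (all names hypothetical):
// Hypothetical tuning aid: false-positive rate among analyses at or above a threshold.
type LabeledAnalysis = { confidence: number; humanSaysViolation: boolean };

function falsePositiveRate(sample: LabeledAnalysis[], threshold: number): number {
  const flagged = sample.filter((s) => s.confidence >= threshold);
  if (flagged.length === 0) return 0;
  return flagged.filter((s) => !s.humanSaysViolation).length / flagged.length;
}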

Configuration

In Settings → Kuro AI:
  1. Enable AI analysis — Turn the engine on for the project.
  2. Confidence threshold — Minimum confidence (1–100%) to auto-create a report.
  3. Scan external links — Fetch and analyze URLs discovered in text.
  4. Report categories — Maintain labels and optional AI descriptions under Settings → Moderation → Report categories.
Example category helper text:
Name: Harassment
AI Description: Content that targets individuals with abusive language, threats, or repeated unwanted contact

Content flag detection

Analyze a text snippet on demand (for example before publish).

Endpoint

POST /api/v1/organizations/{orgSlug}/projects/{projectId}/content

Request

Field        Type     Required  Description
content      string   Yes       Content text to analyze. Max 10,000 characters.
externalId   string   No        Your unique content identifier (for example a post ID).
contentType  string   No        Content type hint (for example post, comment, message).
{
  "content": "This post contains harmful targeted harassment against another user.",
  "externalId": "post_abc123",
  "contentType": "post"
}
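
For TypeScript callers, the request body can be typed from the table above (an illustrative shape, not an official SDK type):
// Illustrative request shape; fields follow the table above.
interface ContentAnalysisRequest {
  content: string;      // required; max 10,000 characters
  externalId?: string;  // optional; your unique content identifier
  contentType?: string; // optional; hint such as "post", "comment", "message"
}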

Response

Field                Type           Description
analysis.categories  string[]       Matched report category names (may be empty).
analysis.severity    string         One of LOW, MEDIUM, HIGH, CRITICAL.
analysis.confidence  number         Confidence score 0–100.
analysis.reasoning   string         Brief explanation of the classification.
reportCreated        boolean        Whether a report was auto-created.
reportId             string | null  Report ID if created, otherwise null.
{
  "analysis": {
    "categories": ["Harassment"],
    "severity": "HIGH",
    "confidence": 87,
    "reasoning": "This content contains targeted harassment with abusive language directed at another user."
  },
  "reportCreated": true,
  "reportId": "clx..."
}
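
The matching response shape, again as an illustrative TypeScript sketch:
// Illustrative response shape; fields follow the table above.
interface ContentAnalysisResponse {
  analysis: {
    categories: string[]; // may be empty
    severity: "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
    confidence: number; // 0–100
    reasoning: string;
  };
  reportCreated: boolean;
  reportId: string | null; // set only when a report was created
}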

Auto-report creation

When confidence meets your threshold, Whistle can create a report with:
  • Reason aligned to the detected category
  • Description summarizing severity, confidence, reasoning, and the source text
  • Target metadata from externalId / contentType when you supply them
  • automated: true (AI-sourced)
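
If you keep your own content records, it can be worth storing the returned report ID for cross-referencing; a sketch (the persistence callback is a placeholder, not part of the API):
// Sketch: link an auto-created report back to your own content record.
// `updatePost` stands in for whatever persistence layer you use.
async function recordReportLink(
  postId: string,
  result: { reportCreated: boolean; reportId: string | null },
  updatePost: (id: string, patch: { whistleReportId: string }) => Promise<void>,
): Promise<void> {
  if (result.reportCreated && result.reportId !== null) {
    await updatePost(postId, { whistleReportId: result.reportId });
  }
}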

Errors

Status  Error
400     Content is required
400     Content too long. Maximum 10000 characters.
400     Content analysis is not enabled for this project
400     No report categories configured for this project
404     Not found — invalid orgSlug or projectId
401     Unauthorized — invalid or missing API key
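
A defensive caller should branch on these statuses before reading the analysis; a sketch, assuming the error body is JSON (its exact shape is not documented here):
// Sketch of defensive handling for non-2xx responses from the content endpoint.
// The error-body shape is an assumption; adjust to what your deployment returns.
async function readAnalysisOrThrow(response: Response): Promise<unknown> {
  if (!response.ok) {
    const body = await response.json().catch(() => ({}));
    throw new Error(`Kuro AI analysis failed (${response.status}): ${JSON.stringify(body)}`);
  }
  return response.json();
}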

Report enrichment

When enabled, Kuro AI enriches ingested reports asynchronously with severity, categories, and reasoning. Depending on confidence, the system may move a report from OPEN to IN_REVIEW after enrichment completes; see Moderation workflow for the full moderation pipeline.

What gets enriched

  • Report description text
  • Original submitted content (for API-submitted reports)
  • Reasoning and confidence scores
  • Matched categories from your configuration

Audit trail

AI enrichment and analysis emit audit events (for example AI_ANALYZED) with severity, confidence, categories, and reasoning for compliance review.

External link scanning

With Scan external links enabled, URLs in the submitted content string are fetched and scored.

How it works

  1. Content is submitted to the flag detection endpoint
  2. Kuro AI extracts all HTTP/HTTPS URLs from the text
  3. Each URL is fetched and analyzed (asynchronously, in the background)
  4. Reports are auto-created if the confidence threshold is met
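
Step 2 happens server-side, but for intuition, extraction amounts to something like this (illustrative only; the actual pattern is not published):
// Illustrative only: pull HTTP/HTTPS URLs out of a content string.
function extractUrls(content: string): string[] {
  return content.match(/https?:\/\/[^\s]+/g) ?? [];
}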

Scan behavior

  • URLs are fetched with a 10-second timeout
  • Maximum response size: 100 KB
  • HTML content is stripped (scripts, styles, tags removed)
  • Content is cached for 24 hours per URL to avoid duplicate scans
  • Multiple URLs in the same content are scanned in parallel

Example

{
  "content": "Check out this post: https://example.com/spam-post and this one: https://example.com/harassment",
  "externalId": "comment_123",
  "contentType": "comment"
}
Both URLs will be automatically extracted and analyzed in the background.

Analysis history

List recent analyses for a project:
GET /api/v1/organizations/{orgSlug}/projects/{projectId}/content-analyses
Returns the 50 most recent rows with categories, severity, confidence, reasoning, and linked report IDs when present.
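
A minimal sketch of fetching the history, reusing the same bearer auth as the content endpoint:
// List recent analyses for a project (same bearer auth as the content endpoint).
const historyRes = await fetch(
  `${base}/api/v1/organizations/${orgSlug}/projects/${projectId}/content-analyses`,
  { headers: { Authorization: `Bearer ${apiKey}` } },
);
const analyses = await historyRes.json();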

Integration patterns

Pre-publish moderation

Call the content endpoint from your backend before you persist or syndicate UGC:
const base = "https://api.kansato.com"; // or your self-hosted origin

async function moderatePost(postId: string, postContent: string) {
  const response = await fetch(
    `${base}/api/v1/organizations/${orgSlug}/projects/${projectId}/content`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        content: postContent,
        externalId: postId,
        contentType: "post",
      }),
    },
  );

  const result = await response.json();

  // Block clearly severe content; surface everything else with a warning.
  // Note: confidence alone should not block, since high confidence on a LOW
  // severity means the model is confident the content is safe.
  if (
    result.analysis.severity === "CRITICAL" ||
    (result.analysis.severity === "HIGH" && result.analysis.confidence > 90)
  ) {
    return { allowed: false, reason: result.analysis.reasoning };
  }

  return { allowed: true, warning: result.analysis.reasoning };
}

Post-publish monitoring

The same endpoint works after content is live if you want asynchronous risk scoring and optional auto-report creation.
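
For example, you might re-score a sample of recent content on a schedule; a sketch where `listRecentPosts` and `analyzeContent` are placeholders for your own data access and the POST request shown above:
// Sketch: periodically re-score recent posts with the same content endpoint.
async function rescanRecentPosts(
  listRecentPosts: () => Promise<{ id: string; body: string }[]>,
  analyzeContent: (req: { content: string; externalId: string; contentType: string }) => Promise<void>,
): Promise<void> {
  for (const post of await listRecentPosts()) {
    await analyzeContent({ content: post.body, externalId: post.id, contentType: "post" });
  }
}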

Text with URLs

Enable Scan external links so URLs embedded in content are fetched and analyzed in the background (see External link scanning above).

Best practices

  1. Start near the default threshold (for example 70%) and move it based on false positives.
  2. Invest in short, concrete category descriptions — they steer the model more than long policy prose.
  3. Sample-review automated reports weekly; automation drifts as slang and tactics change.
  4. Keep humans in the loop for enforcement; AI should triage, not silently ban.
  5. Watch low-confidence buckets — they usually mean ambiguous policy or missing categories.

Limitations

  • Text-first — Images/video are out of scope unless you transcribe or describe them in text.
  • Request size — Up to 10,000 characters accepted per call; the model analyzes a shorter inner window (see server limits).
  • Language — Best results in English today; expect variance elsewhere.
  • Context — Community norms and sarcasm still need human judgment.