Kuro AI is Whistle’s built-in content moderation engine. It analyzes submitted text against your project’s configured report categories and provides risk scoring, severity assessment, and automated report creation.
Kuro AI is available on the Pro and Enterprise plans. See kansato.com/pricing for details.

Features

  • Content flag detection — Score text before publish or on a schedule
  • Report enrichment — Classify and prioritize ingested reports
  • External link scanning — Follow URLs found in text when enabled
  • Configurable thresholds — Tune when auto-reports are created

How it works

Kuro AI classifies text against your report categories:
  1. Your server calls the analysis endpoint with the content snippet.
  2. The model returns severity, confidence, reasoning, and category hits.
  3. If confidence meets your threshold (and the content is not treated as safe), Whistle can open a report automatically.
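
A rough sketch of the decision in step 3, in TypeScript (the exact server-side rule is not published; treating LOW as safe and the threshold comparison are our assumptions):
// Illustrative only: approximate the auto-report decision from step 3.
// Assumes LOW severity counts as safe; the real rule may differ.
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

function wouldAutoReport(severity: Severity, confidence: number, threshold: number): boolean {
  return severity !== "LOW" && confidence >= threshold;
}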

Severity levels

Level      Description
LOW        No policy violation likely. Normal content.
MEDIUM     Potentially problematic. Borderline content requiring context.
HIGH       Likely policy violation. Clearly violates common trust & safety policies.
CRITICAL   Severe violation. Illegal content, immediate threats, or urgent action required.

Confidence score

Confidence ranges from 0 to 100 and expresses model certainty for the returned severity and categories. Tune your threshold against the false-positive rate you observe in your own environment.
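
One way to tune is to hand-label a sample of past analyses and measure the false-positive rate at candidate thresholds; a minimal sketch (all names hypothetical):
// Hypothetical tuning aid: false-positive rate among analyses at or above a threshold.
type LabeledAnalysis = { confidence: number; humanSaysViolation: boolean };

function falsePositiveRate(sample: LabeledAnalysis[], threshold: number): number {
  const flagged = sample.filter((s) => s.confidence >= threshold);
  if (flagged.length === 0) return 0;
  return flagged.filter((s) => !s.humanSaysViolation).length / flagged.length;
}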

Configuration

In Settings → Kuro AI:
  1. Enable AI analysis — Turn the engine on for the project.
  2. Confidence threshold — Minimum confidence (1–100%) to auto-create a report.
  3. Scan external links — Fetch and analyze URLs discovered in text.
  4. Report categories — Maintain labels and optional AI descriptions under Settings → Moderation → Report categories.
Example category helper text:
Name: Harassment
AI Description: Content that targets individuals with abusive language, threats, or repeated unwanted contact

Content flag detection

Analyze a text snippet on demand (for example before publish).

Endpoint

POST /api/v1/organizations/{orgSlug}/projects/{projectId}/content

Request

Field        Type     Required  Description
content      string   Yes       Content text to analyze. Max 10,000 characters.
externalId   string   No        Your unique content identifier (for example a post ID).
contentType  string   No        Content type hint (for example post, comment, message).
{
  "content": "This post contains harmful targeted harassment against another user.",
  "externalId": "post_abc123",
  "contentType": "post"
}
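
For TypeScript callers, the request body can be typed from the table above (an illustrative shape, not an official SDK type):
// Illustrative request shape; fields follow the table above.
interface ContentAnalysisRequest {
  content: string;      // required; max 10,000 characters
  externalId?: string;  // optional; your unique content identifier
  contentType?: string; // optional; hint such as "post", "comment", "message"
}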

Response

Field                Type           Description
analysis.categories  string[]       Matched report category names (may be empty).
analysis.severity    string         One of LOW, MEDIUM, HIGH, CRITICAL.
analysis.confidence  number         Confidence score 0–100.
analysis.reasoning   string         Brief explanation of the classification.
reportCreated        boolean        Whether a report was auto-created.
reportId             string | null  Report ID if created, otherwise null.
{
  "analysis": {
    "categories": ["Harassment"],
    "severity": "HIGH",
    "confidence": 87,
    "reasoning": "This content contains targeted harassment with abusive language directed at another user."
  },
  "reportCreated": true,
  "reportId": "clx..."
}
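
The matching response shape, again as an illustrative TypeScript sketch:
// Illustrative response shape; fields follow the table above.
interface ContentAnalysisResponse {
  analysis: {
    categories: string[]; // may be empty
    severity: "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";
    confidence: number; // 0–100
    reasoning: string;
  };
  reportCreated: boolean;
  reportId: string | null; // set only when a report was created
}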

Auto-report creation

When confidence meets your threshold, Whistle can create a report with:
  • Reason aligned to the detected category
  • Description summarizing severity, confidence, reasoning, and the source text
  • Target metadata from externalId / contentType when you supply them
  • automated: true (AI-sourced)
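
If you keep your own content records, it can be worth storing the returned report ID for cross-referencing; a sketch (the persistence callback is a placeholder, not part of the API):
// Sketch: link an auto-created report back to your own content record.
// `updatePost` stands in for whatever persistence layer you use.
async function recordReportLink(
  postId: string,
  result: { reportCreated: boolean; reportId: string | null },
  updatePost: (id: string, patch: { whistleReportId: string }) => Promise<void>,
): Promise<void> {
  if (result.reportCreated && result.reportId !== null) {
    await updatePost(postId, { whistleReportId: result.reportId });
  }
}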

Errors

Status  Error
400     Content is required
400     Content too long. Maximum 10000 characters.
400     Content analysis is not enabled for this project
400     No report categories configured for this project
404     Not found — invalid orgSlug or projectId
401     Unauthorized — invalid or missing API key
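
A defensive caller should branch on these statuses before reading the analysis; a sketch, assuming the error body is JSON (its exact shape is not documented here):
// Sketch of defensive handling for non-2xx responses from the content endpoint.
// The error-body shape is an assumption; adjust to what your deployment returns.
async function readAnalysisOrThrow(response: Response): Promise<unknown> {
  if (!response.ok) {
    const body = await response.json().catch(() => ({}));
    throw new Error(`Kuro AI analysis failed (${response.status}): ${JSON.stringify(body)}`);
  }
  return response.json();
}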

Report enrichment

When enabled, Kuro AI enriches ingested reports asynchronously with severity, categories, and reasoning. Depending on confidence, the system may move a report from OPEN to IN_REVIEW after enrichment completes; see Moderation workflow for the full moderation pipeline.

What gets enriched

  • Report description text
  • Original submitted content (for API-submitted reports)
  • Reasoning and confidence scores
  • Matched categories from your configuration

Audit trail

AI enrichment and analysis emit audit events (for example AI_ANALYZED) with severity, confidence, categories, and reasoning for compliance review.

External link scanning

With Scan external links enabled, URLs in the submitted content string are fetched and scored.

How it works

  1. Content is submitted to the flag detection endpoint
  2. Kuro AI extracts all HTTP/HTTPS URLs from the text
  3. Each URL is fetched and analyzed (asynchronously, in the background)
  4. Reports are auto-created if the confidence threshold is met
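
Step 2 happens server-side, but for intuition, extraction amounts to something like this (illustrative only; the actual pattern is not published):
// Illustrative only: pull HTTP/HTTPS URLs out of a content string.
function extractUrls(content: string): string[] {
  return content.match(/https?:\/\/[^\s]+/g) ?? [];
}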

Scan behavior

  • URLs are fetched with a 10-second timeout
  • Maximum response size: 100 KB
  • HTML content is stripped (scripts, styles, tags removed)
  • Content is cached for 24 hours per URL to avoid duplicate scans
  • Multiple URLs in the same content are scanned in parallel

Example

{
  "content": "Check out this post: https://example.com/spam-post and this one: https://example.com/harassment",
  "externalId": "comment_123",
  "contentType": "comment"
}
Both URLs will be automatically extracted and analyzed in the background.

Analysis history

List recent analyses for a project:
GET /api/v1/organizations/{orgSlug}/projects/{projectId}/content-analyses
Returns the 50 most recent rows with categories, severity, confidence, reasoning, and linked report IDs when present.
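
A minimal sketch of fetching the history, reusing the same bearer auth as the content endpoint:
// List recent analyses for a project (same bearer auth as the content endpoint).
const historyRes = await fetch(
  `${base}/api/v1/organizations/${orgSlug}/projects/${projectId}/content-analyses`,
  { headers: { Authorization: `Bearer ${apiKey}` } },
);
const analyses = await historyRes.json();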

Integration patterns

Pre-publish moderation

Call the content endpoint from your backend before you persist or syndicate UGC:
const base = "https://api.kansato.com"; // or your self-hosted origin

async function moderatePost(postId: string, postContent: string) {
  const response = await fetch(
    `${base}/api/v1/organizations/${orgSlug}/projects/${projectId}/content`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        content: postContent,
        externalId: postId,
        contentType: "post",
      }),
    },
  );

  const result = await response.json();

  // Block clearly severe content; surface everything else with a warning.
  // Note: confidence alone should not block, since high confidence on a LOW
  // severity means the model is confident the content is safe.
  if (
    result.analysis.severity === "CRITICAL" ||
    (result.analysis.severity === "HIGH" && result.analysis.confidence > 90)
  ) {
    return { allowed: false, reason: result.analysis.reasoning };
  }

  return { allowed: true, warning: result.analysis.reasoning };
}

Post-publish monitoring

The same endpoint works after content is live if you want asynchronous risk scoring and optional auto-report creation.
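
For example, you might re-score a sample of recent content on a schedule; a sketch where `listRecentPosts` and `analyzeContent` are placeholders for your own data access and the POST request shown above:
// Sketch: periodically re-score recent posts with the same content endpoint.
async function rescanRecentPosts(
  listRecentPosts: () => Promise<{ id: string; body: string }[]>,
  analyzeContent: (req: { content: string; externalId: string; contentType: string }) => Promise<void>,
): Promise<void> {
  for (const post of await listRecentPosts()) {
    await analyzeContent({ content: post.body, externalId: post.id, contentType: "post" });
  }
}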

Text with URLs

Enable Scan external links so URLs embedded in content are fetched and analyzed in the background (see External link scanning above).

Best practices

  1. Start near the default threshold (for example 70%) and move it based on false positives.
  2. Invest in short, concrete category descriptions — they steer the model more than long policy prose.
  3. Sample-review automated reports weekly; automation drifts as slang and tactics change.
  4. Keep humans in the loop for enforcement; AI should triage, not silently ban.
  5. Watch low-confidence buckets — they usually mean ambiguous policy or missing categories.

Limitations

  • Text-first — Images/video are out of scope unless you transcribe or describe them in text.
  • Request size — Up to 10,000 characters accepted per call; the model analyzes a shorter inner window (see server limits).
  • Language — Best results in English today; expect variance elsewhere.
  • Context — Community norms and sarcasm still need human judgment.