Guardrails

Policy model

A guardrail policy defines which checks to run and what action to take when a violation is detected.

{
  "enabled": true,
  "checks": [
    { "type": "pii", "target": "both" },
    { "type": "prompt-injection", "target": "input" },
    { "type": "max-tokens", "target": "output", "config": { "limit": 5000 } }
  ],
  "action": "warn"
}
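
As a rough illustration of the policy shape above, the following Python sketch checks a policy document before it is sent to the API. The function name `validate_policy` is hypothetical, and the rule that a `regex` check needs `config.pattern` is an assumption inferred from the check-type table. The service may validate differently.

```python
# Allowed values taken from the tables in this document.
CHECK_TYPES = {"pii", "toxicity", "prompt-injection", "regex", "max-tokens"}
TARGETS = {"input", "output", "both"}
ACTIONS = {"block", "warn", "log"}


def validate_policy(policy: dict) -> list[str]:
    """Return a list of problems; an empty list means the policy looks well-formed."""
    problems = []
    if not isinstance(policy.get("enabled"), bool):
        problems.append("'enabled' must be true or false")
    if policy.get("action") not in ACTIONS:
        problems.append(f"unknown action: {policy.get('action')!r}")
    for i, check in enumerate(policy.get("checks", [])):
        if check.get("type") not in CHECK_TYPES:
            problems.append(f"checks[{i}]: unknown type {check.get('type')!r}")
        if check.get("target") not in TARGETS:
            problems.append(f"checks[{i}]: unknown target {check.get('target')!r}")
        # Assumption: a regex check is meaningless without config.pattern.
        if check.get("type") == "regex" and "pattern" not in check.get("config", {}):
            problems.append(f"checks[{i}]: regex check needs config.pattern")
    return problems


example = {
    "enabled": True,
    "checks": [
        {"type": "pii", "target": "both"},
        {"type": "max-tokens", "target": "output", "config": {"limit": 5000}},
    ],
    "action": "warn",
}
print(validate_policy(example))
```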

Actions

Action	Behavior
`block`	Returns 403 and prevents execution (input checks only, since output checks never block). Violations recorded.
`warn`	Adds `X-Guardrail-Warnings` header and allows execution. Violations recorded.
`log`	Allows execution silently. Violations recorded for audit.
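
The three actions can be pictured as a small decision function. This is only a sketch: `Outcome` and `apply_action` are hypothetical names, and the value written to `X-Guardrail-Warnings` (a comma-separated list of check types) is an assumption, since the header format is not specified here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Outcome:
    proceed: bool                  # whether the function is executed
    status: Optional[int] = None   # HTTP status when the request is rejected
    headers: dict = field(default_factory=dict)


def apply_action(action: str, violations: list[str]) -> Outcome:
    """Map a policy action and the detected violations to a request outcome."""
    if not violations:
        return Outcome(proceed=True)
    if action == "block":
        return Outcome(proceed=False, status=403)
    if action == "warn":
        # Assumed header format: comma-separated check types.
        return Outcome(proceed=True, headers={"X-Guardrail-Warnings": ",".join(violations)})
    if action == "log":
        return Outcome(proceed=True)
    raise ValueError(f"unknown action: {action!r}")
```

In all three cases the violations would also be recorded. That step is left out of the sketch to keep it short.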

Check types

Type	Description	Target	Severity
`pii`	Detects SSN, credit card, email, phone, AWS key patterns	input, output, both	high
`toxicity`	AI text classification for negative/toxic content	input, output, both	medium
`prompt-injection`	Heuristic detection of injection attempts (ignore instructions, role hijacking, system prompt leaks)	input, output, both	high
`regex`	Custom regex pattern via `config.pattern` and optional `config.flags`	input, output, both	medium
`max-tokens`	Approximate token count (whitespace-split) vs `config.limit` (default 5000)	input, output, both	low

Setting policies

Organization default

Set a default policy that applies to all functions:

curl -X PUT https://api.fold.run/guardrails/default \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "warn"
    }
  }'

Per-function override

Override the default for a specific function:

curl -X PUT https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "toxicity", "target": "output" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "block"
    }
  }'

Delete a function override to fall back to the organization default:

curl -X DELETE https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN"
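
The fallback described above amounts to a two-level lookup. The sketch below uses a hypothetical `resolve_policy` function and plain dictionaries to stand in for stored policies. How the service stores and resolves policies internally is not documented here.

```python
from typing import Optional


def resolve_policy(function_id: str, overrides: dict, default: Optional[dict]) -> Optional[dict]:
    """Return the function's own policy if one exists, else the organization default."""
    return overrides.get(function_id, default)


org_default = {"enabled": True, "checks": [{"type": "pii", "target": "both"}], "action": "warn"}
overrides = {"fn_abc123": {"enabled": True, "checks": [{"type": "pii", "target": "both"}], "action": "block"}}

print(resolve_policy("fn_abc123", overrides, org_default)["action"])  # override applies
del overrides["fn_abc123"]  # equivalent to the DELETE call above
print(resolve_policy("fn_abc123", overrides, org_default)["action"])  # falls back to the default
```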

Where guardrails run

Guardrails are integrated into every dispatch path:

Path	Pre-execution (input)	Post-execution (output)
Function dispatch (`{tenant}.fold.run/{fn}`)	Blocking	Fire-and-forget
MCP tool calls (`/.well-known/mcp`)	Blocking	Fire-and-forget
A2A tasks (`/_a2a/`)	Blocking	Fire-and-forget
Pipeline steps	Blocking (halts pipeline)	Fire-and-forget

Input guardrails run before dispatch. If the action is `block` and violations are found, the request is rejected with a 403 response.

Output guardrails run after the response is received. They are always fire-and-forget (non-blocking): violations are recorded, but the response is still returned to the caller. For streaming responses, output guardrails are skipped because the response body cannot be buffered.
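
The difference between the blocking input path and the fire-and-forget output path can be sketched as follows. All names (`dispatch`, `check`, `run`, `record`) are hypothetical, and a background thread is only one possible way to model fire-and-forget. The platform's actual mechanism is not described here.

```python
import threading


def dispatch(text, action, check, run, record, streaming=False):
    """Return (status, body, worker); worker is the background output scan, if any."""
    # Input checks run inline, so they can stop the request before execution.
    violations = check(text, "input")
    if violations:
        record(violations)
        if action == "block":
            return 403, None, None

    body = run(text)

    # Output checks never delay or alter the response; streaming bodies are skipped.
    worker = None
    if not streaming:
        def scan_output():
            found = check(body, "output")
            if found:
                record(found)

        worker = threading.Thread(target=scan_output, daemon=True)
        worker.start()
    return 200, body, worker
```

The worker thread is returned only so that a caller or test can wait for the scan to finish. A real gateway would not expose it.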

Viewing violations

Console

Navigate to Guardrails in the sidebar. The violations table shows recent violations with function name, direction, check type, and action taken.

API

# All violations
GET /guardrails/violations

# Filter by function
GET /guardrails/violations?function_id=fn_abc123
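
For scripted access, the same calls can be prepared with Python's standard library. The sketch assumes these paths live under `https://api.fold.run` and use the same bearer token as the curl examples elsewhere on this page. `violations_request` is a hypothetical helper. It only builds the request, which could then be sent with `urllib.request.urlopen`.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

API_BASE = "https://api.fold.run"  # assumed base URL, as in the curl examples


def violations_request(token: str, function_id: Optional[str] = None) -> Request:
    """Build a GET request for violations, optionally filtered by function."""
    url = f"{API_BASE}/guardrails/violations"
    if function_id:
        url += "?" + urlencode({"function_id": function_id})
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = violations_request("example-token", "fn_abc123")
print(req.get_method(), req.full_url)
```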

Testing policies

Test guardrail checks against sample text without recording violations:

curl -X POST https://api.fold.run/guardrails/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "My SSN is 123-45-6789",
    "checks": [{ "type": "pii", "target": "input" }],
    "direction": "input"
  }'

Response:

{
  "passed": false,
  "violations": [
    {
      "check_type": "pii",
      "direction": "input",
      "details": "SSN pattern detected",
      "severity": "high"
    }
  ]
}
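
A response like the one above is easy to consume in a script, for example to fail a CI step when a sample text triggers serious violations. The helper name `violations_at_or_above` is hypothetical. The severity levels come from the check-type table.

```python
# Severity levels as listed in the check-type table, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def violations_at_or_above(response: dict, min_severity: str = "medium") -> list[dict]:
    """Return the violations whose severity is at least min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    return [v for v in response.get("violations", []) if SEVERITY_RANK[v["severity"]] >= threshold]


sample = {
    "passed": False,
    "violations": [
        {
            "check_type": "pii",
            "direction": "input",
            "details": "SSN pattern detected",
            "severity": "high",
        }
    ],
}

for violation in violations_at_or_above(sample, "high"):
    print(f"{violation['check_type']} ({violation['direction']}): {violation['details']}")
```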

API reference

Method	Path	Description
`GET`	`/guardrails`	List all policies
`GET`	`/guardrails/default`	Get organization default policy
`PUT`	`/guardrails/default`	Set organization default policy
`GET`	`/guardrails/:functionId`	Get function policy
`PUT`	`/guardrails/:functionId`	Set function policy
`DELETE`	`/guardrails/:functionId`	Remove function override
`GET`	`/guardrails/violations`	List violations
`POST`	`/guardrails/test`	Test checks (dry-run)

Notes

The toxicity check requires the AI binding to be enabled. If the binding is not configured, the check is silently skipped.
PII detection uses regex patterns (SSN, credit cards, emails, phone numbers, AWS keys). It does not use AI-based entity recognition.
Prompt injection detection uses heuristic patterns. It catches common injection attempts but is not exhaustive.
Violations are recorded per-organization and retained indefinitely.
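
To show what regex-based PII detection looks like in practice, here is a small sketch covering three of the listed categories. The patterns are illustrative assumptions, not the ones the service uses, and `detect_pii` is a hypothetical name. Pattern matching of this kind flags anything that has the right shape, so false positives and misses are both possible.

```python
import re

# Illustrative patterns only; the service's real patterns are not published here.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "AWS key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def detect_pii(text: str) -> list[str]:
    """Return one detail string per pattern category found in the text."""
    return [
        f"{label} pattern detected"
        for label, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    ]


print(detect_pii("My SSN is 123-45-6789"))
```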