Guardrails

Guardrails let you add content filtering to your deployed functions. Define policies that check inputs and outputs for PII, toxicity, prompt injection, and custom patterns. Policies can block requests, add warning headers, or log silently.

Policy model

A guardrail policy defines which checks to run and what action to take when a violation is detected.

{
  "enabled": true,
  "checks": [
    { "type": "pii", "target": "both" },
    { "type": "prompt-injection", "target": "input" },
    { "type": "max-tokens", "target": "output", "config": { "limit": 5000 } }
  ],
  "action": "warn"
}
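To make the shape of a policy concrete, the following is a minimal Python sketch of a client-side sanity check you might run before sending a policy. The helper name `validate_policy` and its messages are hypothetical and not part of the platform; the allowed values are the ones listed on this page, and the server's own validation may differ.

```python
# Allowed values as documented on this page.
CHECK_TYPES = {"pii", "toxicity", "prompt-injection", "regex", "max-tokens"}
TARGETS = {"input", "output", "both"}
ACTIONS = {"block", "warn", "log"}


def validate_policy(policy: dict) -> list:
    """Return a list of problems found in a policy dict (empty if none).

    Hypothetical client-side helper; not part of the platform API.
    """
    problems = []
    if not isinstance(policy.get("enabled"), bool):
        problems.append("enabled must be true or false")
    if policy.get("action") not in ACTIONS:
        problems.append(f"unknown action: {policy.get('action')!r}")
    for i, check in enumerate(policy.get("checks", [])):
        if check.get("type") not in CHECK_TYPES:
            problems.append(f"checks[{i}]: unknown type {check.get('type')!r}")
        if check.get("target") not in TARGETS:
            problems.append(f"checks[{i}]: unknown target {check.get('target')!r}")
        # A regex check is only meaningful with a pattern to match.
        if check.get("type") == "regex" and "pattern" not in check.get("config", {}):
            problems.append(f"checks[{i}]: regex check needs config.pattern")
    return problems


policy = {
    "enabled": True,
    "checks": [
        {"type": "pii", "target": "both"},
        {"type": "prompt-injection", "target": "input"},
        {"type": "max-tokens", "target": "output", "config": {"limit": 5000}},
    ],
    "action": "warn",
}
print(validate_policy(policy))
```

An empty list means the policy uses only documented check types, targets, and actions.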

Actions

Action | Behavior
block | Returns 403 and prevents execution. Violations recorded.
warn | Adds X-Guardrail-Warnings header and allows execution. Violations recorded.
log | Allows execution silently. Violations recorded for audit.
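The three actions can be read as a small decision function. The sketch below illustrates that mapping for an input check; it is not the platform's implementation. The `Outcome` type and `apply_action` name are hypothetical, and the header value format (comma-separated check types) is an assumption, since this page does not specify it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Outcome:
    status: Optional[int]  # 403 when blocked, None when execution proceeds
    headers: dict = field(default_factory=dict)
    recorded: bool = False  # whether the violations were recorded


def apply_action(action: str, violations: list) -> Outcome:
    """Map a policy action to an outcome for an input check (illustrative)."""
    if not violations:
        return Outcome(status=None)
    if action == "block":
        return Outcome(status=403, recorded=True)
    if action == "warn":
        # Assumption: the header lists the violated check types, comma-separated.
        header = {"X-Guardrail-Warnings": ",".join(violations)}
        return Outcome(status=None, headers=header, recorded=True)
    if action == "log":
        return Outcome(status=None, recorded=True)
    raise ValueError(f"unknown action: {action!r}")


print(apply_action("warn", ["pii", "prompt-injection"]))
```

In all three cases a detected violation is recorded; the actions differ only in what the caller sees.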

Check types

Type | Description | Target | Severity
pii | Detects SSN, credit card, email, phone, AWS key patterns | input, output, both | high
toxicity | AI text classification for negative/toxic content | input, output, both | medium
prompt-injection | Heuristic detection of injection attempts (ignore instructions, role hijacking, system prompt leaks) | input, output, both | high
regex | Custom regex pattern via config.pattern and optional config.flags | input, output, both | medium
max-tokens | Approximate token count (whitespace-split) vs config.limit (default 5000) | input, output, both | low
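The regex and max-tokens checks are simple enough to approximate locally. The following Python sketch shows one plausible reading of them. The function names are hypothetical, the single-letter flag syntax for config.flags is an assumption (the flag format is not documented here), and whether the limit comparison is strict is also assumed.

```python
import re

# Assumption: config.flags is a string of single-letter flags such as "i" or "m".
_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def regex_violation(text: str, config: dict) -> bool:
    """True if the custom pattern matches anywhere in the text."""
    flags = 0
    for letter in config.get("flags", ""):
        flags |= _FLAGS.get(letter, 0)  # unknown letters are ignored in this sketch
    return re.search(config["pattern"], text, flags) is not None


def max_tokens_violation(text: str, config: dict) -> bool:
    """True if the whitespace-split token count exceeds config.limit."""
    limit = config.get("limit", 5000)  # documented default
    # Assumption: a count equal to the limit is still allowed.
    return len(text.split()) > limit


print(regex_violation("Internal-Only draft", {"pattern": "internal-only", "flags": "i"}))
print(max_tokens_violation("one two three", {"limit": 2}))
```

Because the token count is a whitespace split, it will usually differ from the count a model tokenizer would report, so treat the limit as a rough size cap.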

Setting policies

Organization default

Set a default policy that applies to all functions:

curl -X PUT https://api.fold.run/guardrails/default \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "warn"
    }
  }'

Per-function override

Override the default for a specific function:

curl -X PUT https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "toxicity", "target": "output" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "block"
    }
  }'

Delete a function override to fall back to the organization default:

curl -X DELETE https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN"
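The relationship between the organization default and per-function overrides can be sketched as a simple lookup with fallback. This Python sketch only models the behavior described above; the name `effective_policy` and the in-memory dictionaries are hypothetical stand-ins for whatever storage the platform uses.

```python
from typing import Optional


def effective_policy(function_id: str, overrides: dict, default: Optional[dict]) -> Optional[dict]:
    """Pick the policy for a function: its override if present, else the default."""
    return overrides.get(function_id, default)


default = {"enabled": True, "checks": [], "action": "warn"}
overrides = {"fn_abc123": {"enabled": True, "checks": [], "action": "block"}}

before = effective_policy("fn_abc123", overrides, default)
del overrides["fn_abc123"]  # models deleting the function override
after = effective_policy("fn_abc123", overrides, default)
print(before["action"], after["action"])
```

A function with no override and no organization default resolves to `None` in this sketch, meaning no guardrails apply.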

Where guardrails run

Guardrails are integrated into every dispatch path:

Path Pre-execution (input) Post-execution (output)
Function dispatch ({tenant}.fold.run/{fn}) Blocking Fire-and-forget
MCP tool calls (/.well-known/mcp) Blocking Fire-and-forget
A2A tasks (/_a2a/) Blocking Fire-and-forget
Pipeline steps Blocking (halts pipeline) Fire-and-forget

Input guardrails run before dispatch. If the action is block and violations are found, the request is rejected with a 403 response.

Output guardrails run after the response is received. They are always fire-and-forget (non-blocking), regardless of the configured action: violations are recorded, but the response is still returned to the caller. Output guardrails are skipped for streaming responses, because the response body cannot be buffered.
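The difference between blocking input checks and fire-and-forget output checks can be illustrated with a small asyncio sketch. Everything here is hypothetical: `check` is a toy detector, `recorded` stands in for the violations store, and `dispatch` is not the platform's dispatcher. It only models the ordering described above.

```python
import asyncio

recorded = []       # stand-in for the violations store
background = set()  # keeps references so pending tasks are not garbage-collected


async def check(text, direction):
    """Toy detector: flags any text that contains 'SSN'."""
    return [{"check_type": "pii", "direction": direction}] if "SSN" in text else []


async def record_output_violations(body):
    recorded.extend(await check(body, "output"))


async def dispatch(request_body, handler, action="block", streaming=False):
    # Input checks are awaited before the function runs.
    violations = await check(request_body, "input")
    if violations:
        recorded.extend(violations)
        if action == "block":
            return 403, "blocked by guardrail"
    body = await handler(request_body)
    if not streaming:
        # Fire-and-forget: schedule the output check without awaiting it.
        task = asyncio.create_task(record_output_violations(body))
        background.add(task)
        task.add_done_callback(background.discard)
    return 200, body


async def main():
    async def echo(body):
        return "SSN on file: " + body

    print(await dispatch("my SSN is 123-45-6789", echo))  # rejected on input
    print(await dispatch("hello", echo))                  # output flagged later
    await asyncio.gather(*background)  # only so the demo can observe the result
    print(recorded)


asyncio.run(main())
```

The second call returns its response before the output check has run, which is why an output violation can be recorded but never change what the caller receives.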

Viewing violations

Dashboard

Navigate to Guardrails in the sidebar. The violations table shows recent violations with function name, direction, check type, and action taken.

API

# All violations
GET /guardrails/violations

# Filter by function
GET /guardrails/violations?function_id=fn_abc123
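As a sketch of calling these endpoints from code, the following Python snippet builds the request without sending it. It assumes the violations endpoints live on the same host (`https://api.fold.run`) and use the same Bearer token as the curl examples on this page; the helper name `violations_request` is hypothetical, and the response format is not shown because it is not documented here.

```python
import urllib.request
from urllib.parse import urlencode

API = "https://api.fold.run"  # assumption: same host as the curl examples


def violations_request(token, function_id=None):
    """Build (but do not send) a GET request for the violations list."""
    url = f"{API}/guardrails/violations"
    if function_id:
        url += "?" + urlencode({"function_id": function_id})
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


req = violations_request("example-token", function_id="fn_abc123")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would perform the call.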

Testing policies

Test guardrail checks against sample text without recording violations:

curl -X POST https://api.fold.run/guardrails/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "My SSN is 123-45-6789",
    "checks": [{ "type": "pii", "target": "input" }],
    "direction": "input"
  }'

Response:

{
  "passed": false,
  "violations": [
    {
      "check_type": "pii",
      "direction": "input",
      "details": "SSN pattern detected",
      "severity": "high"
    }
  ]
}
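A dry-run result like the one above is easy to turn into a readable report, for example in a pre-deploy script. The Python sketch below parses the sample response from this page; the `summarize` helper and its output format are hypothetical.

```python
import json

# The sample response from this page, as it would arrive over the wire.
response_text = """
{
  "passed": false,
  "violations": [
    {"check_type": "pii", "direction": "input",
     "details": "SSN pattern detected", "severity": "high"}
  ]
}
"""


def summarize(result: dict) -> str:
    """Render a test result as one line per violation (hypothetical helper)."""
    if result["passed"]:
        return "passed"
    return "\n".join(
        f'{v["severity"]}: {v["check_type"]} on {v["direction"]} ({v["details"]})'
        for v in result["violations"]
    )


print(summarize(json.loads(response_text)))
```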

API reference

Method | Path | Description
GET | /guardrails | List all policies
GET | /guardrails/default | Get organization default policy
PUT | /guardrails/default | Set organization default policy
GET | /guardrails/:functionId | Get function policy
PUT | /guardrails/:functionId | Set function policy
DELETE | /guardrails/:functionId | Remove function override
GET | /guardrails/violations | List violations
POST | /guardrails/test | Test checks (dry-run)

Notes

  • The toxicity check requires the AI binding to be enabled. If the binding is not configured, the check is silently skipped.
  • PII detection uses regex patterns (SSN, credit cards, emails, phone numbers, AWS keys). It does not use AI-based entity recognition.
  • Prompt injection detection uses heuristic patterns. It catches common injection attempts but is not exhaustive.
  • Violations are recorded per-organization and retained indefinitely.
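To illustrate what regex-based PII detection looks like in practice, here is a minimal Python sketch. The patterns are illustrative only and are not the ones the platform uses; phone numbers are omitted for brevity, and the `detect_pii` name is hypothetical. Like any regex approach, it matches shapes rather than meaning, so expect both false positives and misses.

```python
import re

# Illustrative patterns only; the platform's actual patterns are not documented here.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "AWS key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def detect_pii(text: str) -> list:
    """Return one message per pattern family that matches the text."""
    return [
        f"{name} pattern detected"
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    ]


print(detect_pii("My SSN is 123-45-6789"))
```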