Guardrails
Guardrails let you add content filtering to your deployed functions. Define policies that check inputs and outputs for PII, toxicity, prompt injection, and custom patterns. Policies can block requests, add warning headers, or log silently.
Policy model
A guardrail policy defines which checks to run and what action to take when a violation is detected.
{
  "enabled": true,
  "checks": [
    { "type": "pii", "target": "both" },
    { "type": "prompt-injection", "target": "input" },
    { "type": "max-tokens", "target": "output", "config": { "limit": 5000 } }
  ],
  "action": "warn"
}
Actions
| Action | Behavior |
|---|---|
| block | Returns 403 and prevents execution. Violations recorded. |
| warn | Adds X-Guardrail-Warnings header and allows execution. Violations recorded. |
| log | Allows execution silently. Violations recorded for audit. |
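The action table can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the platform's implementation: `Outcome` and `apply_action` are hypothetical names, and the comma-separated format of the `X-Guardrail-Warnings` value is a guess, since only the header name is documented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Outcome:
    """Hypothetical result of applying a policy action to a request."""
    proceed: bool                    # whether the function is executed
    status: Optional[int] = None     # HTTP status forced by the guardrail, if any
    headers: Dict[str, str] = field(default_factory=dict)
    recorded: bool = False           # whether violations are recorded


def apply_action(action: str, violations: List[str]) -> Outcome:
    """Map a policy action and a list of violated check types to an outcome."""
    if not violations:
        return Outcome(proceed=True)
    if action == "block":
        return Outcome(proceed=False, status=403, recorded=True)
    if action == "warn":
        # Header value format is an assumption; the docs only name the header.
        return Outcome(
            proceed=True,
            headers={"X-Guardrail-Warnings": ",".join(violations)},
            recorded=True,
        )
    if action == "log":
        return Outcome(proceed=True, recorded=True)
    raise ValueError(f"unknown action: {action!r}")
```

All three actions record the violation; they differ only in whether execution continues and whether the caller is told.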
Check types
| Type | Description | Target | Severity |
|---|---|---|---|
| pii | Detects SSN, credit card, email, phone, AWS key patterns | input, output, both | high |
| toxicity | AI text classification for negative/toxic content | input, output, both | medium |
| prompt-injection | Heuristic detection of injection attempts (ignore instructions, role hijacking, system prompt leaks) | input, output, both | high |
| regex | Custom regex pattern via config.pattern and optional config.flags | input, output, both | medium |
| max-tokens | Approximate token count (whitespace-split) vs config.limit (default 5000) | input, output, both | low |
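To illustrate the two configurable checks, here is a rough sketch of how `max-tokens` and `regex` might behave. The function names are hypothetical. It also assumes that exceeding the limit means strictly more tokens than `config.limit`, and that `config.flags` is a string of single-letter flags such as `"i"`; neither detail is confirmed by this page.

```python
import re
from typing import Optional

# Assumed mapping from single-letter flags to Python regex flags.
_FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def approx_token_count(text: str) -> int:
    """Approximate the token count by splitting on runs of whitespace."""
    return len(text.split())


def check_max_tokens(text: str, config: Optional[dict] = None) -> bool:
    """Return True when the text is within the limit (default 5000)."""
    limit = (config or {}).get("limit", 5000)
    return approx_token_count(text) <= limit


def check_regex(text: str, config: dict) -> bool:
    """Return True when the custom pattern matches, i.e. a violation."""
    flags = 0
    for letter in config.get("flags", ""):
        flags |= _FLAG_MAP.get(letter, 0)  # unknown letters are ignored
    return re.search(config["pattern"], text, flags) is not None
```

Because the count is whitespace-split, it can differ noticeably from a model tokenizer's count, so treat the limit as a coarse size guard.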
Setting policies
Organization default
Set a default policy that applies to all functions:
curl -X PUT https://api.fold.run/guardrails/default \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "warn"
    }
  }'
Per-function override
Override the default for a specific function:
curl -X PUT https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "policy": {
      "enabled": true,
      "checks": [
        { "type": "pii", "target": "both" },
        { "type": "toxicity", "target": "output" },
        { "type": "prompt-injection", "target": "input" }
      ],
      "action": "block"
    }
  }'
Delete a function override to fall back to the organization default:
curl -X DELETE https://api.fold.run/guardrails/fn_abc123 \
  -H "Authorization: Bearer $TOKEN"
Where guardrails run
Guardrails are integrated into every dispatch path:
| Path | Pre-execution (input) | Post-execution (output) |
|---|---|---|
| Function dispatch ({tenant}.fold.run/{fn}) | Blocking | Fire-and-forget |
| MCP tool calls (/.well-known/mcp) | Blocking | Fire-and-forget |
| A2A tasks (/_a2a/) | Blocking | Fire-and-forget |
| Pipeline steps | Blocking (halts pipeline) | Fire-and-forget |
Input guardrails run before dispatch. If the action is block and violations are found, the request is rejected with a 403 response.
Output guardrails run after the function returns its response. They are always fire-and-forget (non-blocking): violations are recorded, but the response is still returned to the caller, even when the action is block. Output guardrails are skipped for streaming responses, because the response body cannot be buffered.
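The ordering described above can be sketched as a single dispatch function. This is an illustrative model only: `dispatch`, its parameters, and the `recorded` list are hypothetical. The output check runs inline here for simplicity, whereas the platform runs it without blocking the response. The `warn` header is omitted to keep the sketch short.

```python
from typing import Callable, List, Tuple

# A check takes text and returns the list of violated check types.
Check = Callable[[str], List[str]]


def dispatch(
    body: str,
    handler: Callable[[str], str],
    input_check: Check,
    output_check: Check,
    action: str,
    recorded: List[Tuple[str, str]],
    streaming: bool = False,
) -> Tuple[int, str]:
    """Run input checks, call the function, then run output checks."""
    # Pre-execution: input violations are recorded for every action,
    # but only "block" stops the request.
    input_violations = input_check(body)
    recorded.extend(("input", v) for v in input_violations)
    if input_violations and action == "block":
        return 403, "blocked by guardrail"

    response = handler(body)

    # Post-execution: skipped for streaming responses. Otherwise violations
    # are recorded and the response is returned unchanged.
    if not streaming:
        recorded.extend(("output", v) for v in output_check(response))
    return 200, response
```

One consequence worth noting: with this ordering, a `block` policy can stop a bad input but never withholds a response that has already been produced.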
Viewing violations
Dashboard
Navigate to Guardrails in the sidebar. The violations table shows recent violations with function name, direction, check type, and action taken.
API
# All violations
GET /guardrails/violations
# Filter by function
GET /guardrails/violations?function_id=fn_abc123
Testing policies
Test guardrail checks against sample text without recording violations:
curl -X POST https://api.fold.run/guardrails/test \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "My SSN is 123-45-6789",
    "checks": [{ "type": "pii", "target": "input" }],
    "direction": "input"
  }'
Response:
{
  "passed": false,
  "violations": [
    {
      "check_type": "pii",
      "direction": "input",
      "details": "SSN pattern detected",
      "severity": "high"
    }
  ]
}
API reference
| Method | Path | Description |
|---|---|---|
| GET | /guardrails | List all policies |
| GET | /guardrails/default | Get organization default policy |
| PUT | /guardrails/default | Set organization default policy |
| GET | /guardrails/:functionId | Get function policy |
| PUT | /guardrails/:functionId | Set function policy |
| DELETE | /guardrails/:functionId | Remove function override |
| GET | /guardrails/violations | List violations |
| POST | /guardrails/test | Test checks (dry-run) |
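For callers who prefer a script to curl, the endpoints in the table could be wrapped with the Python standard library roughly as follows. `guardrails_request` is a hypothetical helper, and the token value is a placeholder. The sketch only builds the request object and does not send it.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.fold.run"


def guardrails_request(
    method: str, path: str, token: str, payload: Optional[dict] = None
) -> urllib.request.Request:
    """Build (but do not send) an authenticated guardrails API request."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Dry-run a PII check, mirroring the curl example in "Testing policies".
req = guardrails_request(
    "POST",
    "/guardrails/test",
    "example-token",  # placeholder; use a real API token
    {
        "text": "My SSN is 123-45-6789",
        "checks": [{"type": "pii", "target": "input"}],
        "direction": "input",
    },
)
# To send it: urllib.request.urlopen(req)
```

The same helper covers the other rows, for example `guardrails_request("DELETE", "/guardrails/fn_abc123", token)` to remove a function override.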
Notes
- The toxicity check requires the AI binding to be enabled. If not configured, it is silently skipped.
- PII detection uses regex patterns (SSN, credit cards, emails, phone numbers, AWS keys). It does not use AI-based entity recognition.
- Prompt injection detection uses heuristic patterns. It catches common injection attempts but is not exhaustive.
- Violations are recorded per-organization and retained indefinitely.
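To make the regex-based PII note concrete, the sketch below shows what pattern matching of this kind looks like. The patterns in `PII_PATTERNS` are illustrative assumptions, not the platform's actual patterns, and `find_pii` is a hypothetical name. The returned records are shaped like the violations in the test endpoint's response.

```python
import re
from typing import List

# Hypothetical patterns for illustration; the real patterns are not documented here.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "AWS key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_pii(text: str, direction: str) -> List[dict]:
    """Return one violation record per pattern that matches the text."""
    return [
        {
            "check_type": "pii",
            "direction": direction,
            "details": f"{name} pattern detected",
            "severity": "high",
        }
        for name, pattern in PII_PATTERNS.items()
        if pattern.search(text)
    ]
```

Pattern matching like this flags anything with the right shape, so it can miss PII written in unusual formats and can flag look-alike strings that are not PII.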