AI models and features

fold.run includes a curated catalog of AI models and AI-powered capabilities for code generation and error diagnosis. AI features are available on the Pro plan.

Supported models

Every model ID is prefixed with @fold/. Pass the model ID to fold.ai.run() in your function code.
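For example, a function might call a text-generation model from the catalog. The snippet below is a minimal sketch: the FoldContext type and the input shape passed to fold.ai.run() are illustrative assumptions, since this section only specifies the model IDs.

```typescript
// Sketch only: the FoldContext shape and the fold.ai.run() input format
// below are assumptions for illustration; the catalog model ID is real.
type FoldContext = {
  ai: { run: (model: string, input: unknown) => Promise<unknown> };
};

const MODEL = "@fold/meta/llama-3.1-8b-instruct";

async function summarizeLog(fold: FoldContext, logText: string): Promise<unknown> {
  // The catalog model ID is passed as the first argument.
  return fold.ai.run(MODEL, {
    prompt: `Summarize this deployment log in one sentence:\n${logText}`,
  });
}
```

On the platform, this body would live inside @fold-run/runtime's defineHandler, as in the generated code shown under Code generation.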

Text generation

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/meta/llama-3.3-70b-instruct-fp8-fast | Llama 3.3 70B | Flagship large model, best overall quality |
| @fold/meta/llama-4-scout-17b-16e-instruct | Llama 4 Scout 17B | Latest-generation model with mixture of experts |
| @fold/meta/llama-3.1-8b-instruct | Llama 3.1 8B | Efficient general-purpose model |
| @fold/meta/llama-3.2-3b-instruct | Llama 3.2 3B | Lightweight, fast responses |
| @fold/qwen/qwen2.5-coder-32b-instruct | Qwen 2.5 Coder 32B | Optimized for code generation |
| @fold/qwen/qwq-32b | QwQ 32B | Reasoning-focused model |
| @fold/mistralai/mistral-small-3.1-24b-instruct | Mistral Small 3.1 24B | Strong multilingual support |
| @fold/google/gemma-3-12b-it | Gemma 3 12B | Balanced size and performance |
| @fold/deepseek-ai/deepseek-r1-distill-qwen-32b | DeepSeek R1 32B | Advanced reasoning capabilities |
| @fold/openai/gpt-oss-120b | GPT OSS 120B | Largest available open model |

Embeddings

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/baai/bge-base-en-v1.5 | BGE Base EN | Standard English embeddings |
| @fold/baai/bge-large-en-v1.5 | BGE Large EN | Higher-quality English embeddings |
| @fold/baai/bge-m3 | BGE M3 | Multilingual embeddings |
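Embedding models return vectors, which are typically compared with cosine similarity. The cosine math below is concrete; the commented-out fold.ai.run() call shape is an assumption, since the embedding input/output format is not documented in this section.

```typescript
// Cosine similarity between two embedding vectors: 1.0 means same
// direction (very similar text), 0 means orthogonal (unrelated).
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical call shape (not documented here):
// const { vectors } = await fold.ai.run("@fold/baai/bge-base-en-v1.5", {
//   text: ["deploy failed", "deployment error"],
// });
// cosine(vectors[0], vectors[1]) -> similarity score
```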

Image generation

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/black-forest-labs/flux-1-schnell | FLUX.1 Schnell | Fast, high-quality generation |
| @fold/stabilityai/stable-diffusion-xl-base-1.0 | Stable Diffusion XL | Versatile image generation |
| @fold/leonardo/phoenix-1.0 | Leonardo Phoenix | Creative image generation |

Vision

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/meta/llama-3.2-11b-vision-instruct | Llama 3.2 11B Vision | Image understanding and analysis |

Speech to text

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/openai/whisper-large-v3-turbo | Whisper Large V3 Turbo | Best accuracy transcription |
| @fold/deepgram/nova-3 | Deepgram Nova 3 | Real-time transcription |

Text to speech

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/deepgram/aura-2-en | Deepgram Aura 2 EN | High-quality English speech |
| @fold/myshell-ai/melotts | MeloTTS | Lightweight text-to-speech |

Classification

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/huggingface/distilbert-sst-2-int8 | DistilBERT Sentiment | Sentiment analysis |
| @fold/microsoft/resnet-50 | ResNet-50 | Image classification |

Summarization

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/facebook/bart-large-cnn | BART Large CNN | Document summarization |

Translation

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/meta/m2m100-1.2b | M2M100 1.2B | Multilingual translation |

Reranking

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/baai/bge-reranker-base | BGE Reranker Base | Search result reranking |

Safety

| Model ID | Name | Notes |
| --- | --- | --- |
| @fold/meta/llama-guard-3-8b | Llama Guard 3 | Content safety classification |

Model lifecycle

Each model has a lifecycle status:

  • Active — fully supported and recommended for use.
  • Deprecated — still works, but you should migrate to the recommended successor.
  • Retiring — will stop working on the listed retirement date; new deploys are blocked.
  • Retired — no longer available; all deploys are blocked.

When a model is deprecated or retiring, deploy warnings will indicate the successor model and any deadline. All models listed above are currently active.
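The deploy-blocking rules above can be summarized as a small check. This is a sketch: the lowercase status strings are an assumption about how the lifecycle would be represented in code, not a documented API.

```typescript
// Assumed representation of the four lifecycle statuses described above.
type Lifecycle = "active" | "deprecated" | "retiring" | "retired";

// Retired blocks all deploys; retiring blocks new deploys only.
// Active and deprecated models deploy normally.
function isDeployBlocked(status: Lifecycle, isNewDeploy: boolean): boolean {
  if (status === "retired") return true;
  if (status === "retiring") return isNewDeploy;
  return false;
}
```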

Code generation

Generate a complete function from a natural language description:

curl -X POST https://api.fold.run/ai/generate \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "An API that returns the current time in multiple timezones"
  }'

Response:

{
  "code": "import { defineHandler, type FoldContext } from '@fold-run/runtime';\n\nexport default defineHandler(async (fold: FoldContext) => {\n  // ...\n});",
  "description": "Returns current time for UTC, EST, PST, and CET timezones"
}

The generated code always uses @fold-run/runtime's defineHandler pattern with no external dependencies.

Tool generation

Generate a function with an input schema, ready for MCP or A2A:

curl -X POST https://api.fold.run/ai/generate-tool \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Look up the weather for a given city",
    "input_schema": {
      "type": "object",
      "properties": {
        "city": { "type": "string" }
      }
    }
  }'

Response includes code, description, and the finalized input_schema.

Via the CLI

# Generate and deploy in one step
fold create-tool weather-lookup "Look up weather for a city"

Error diagnosis

When a function throws errors, AI can analyze its source code and recent error logs to suggest a fix:

curl -X POST https://api.fold.run/functions/fn_abc123/diagnose \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "function_id": "fn_abc123",
  "error_count": 10,
  "diagnosis": "The function is failing because `env.API_KEY` is undefined. You need to add an API_KEY secret for this organization before deploying. Run: curl -X POST https://api.fold.run/secrets -H 'Authorization: Bearer <token>' -d '{\"name\": \"API_KEY\", \"value\": \"...\"}'"
}

The diagnosis examines the last 10 error activations and the function's source code to provide actionable suggestions.

Via MCP

AI agents can diagnose errors using the MCP diagnose_function tool:

"Diagnose the errors on my weather-lookup function"

See MCP integration for details.

Plan requirement

AI features require the Pro plan. On the Free plan, AI endpoints return a 403 with error code PRO_PLAN_REQUIRED.

The Pro plan includes 5,000,000 AI tokens per month. Usage beyond this cap is metered at $10.00 per 1,000,000 tokens (see Billing for details). The Free plan has no AI token allowance.
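As a worked example of that pricing (the cap and rate come straight from the paragraph above):

```typescript
// Pro plan: 5,000,000 AI tokens included per month,
// then $10.00 per 1,000,000 tokens beyond the cap.
const INCLUDED_TOKENS = 5_000_000;
const OVERAGE_RATE_PER_MILLION = 10.0;

function monthlyAiOverageUsd(tokensUsed: number): number {
  const overage = Math.max(0, tokensUsed - INCLUDED_TOKENS);
  return (overage / 1_000_000) * OVERAGE_RATE_PER_MILLION;
}

// 7,000,000 tokens in a month -> 2,000,000 over the cap -> $20.00
```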

Upgrade from the billing page or see Billing.

Reliability

AI features are fire-and-forget — they never block deploys or activations. If the AI model is temporarily unavailable, the platform continues operating normally without AI-generated summaries or diagnoses.