
What is a Gate?

A gate is the core building block of Layer AI. It’s a configuration that sits between your application and AI providers, controlling how requests are routed, which models are used, and how failures are handled. When you make a request through Layer, you reference a gate by its ID (a UUID). The gate determines:
  • Which model handles the request
  • What happens if that model fails
  • What parameters are applied (temperature, max tokens, etc.)
  • Whether spending limits are enforced
  • What system prompt is prepended

Gate Types

Layer supports two gate types:

| Type | Use Case |
| --- | --- |
| Standard | Discrete, stateless LLM calls: chat, image generation, embeddings, etc. |
| Agent | Multi-turn agent workflows with session tracking, tool visibility, and per-session spending. See Agent Gates. |

Task Types

Every gate is configured for a specific task type, which determines which models are available:

| Task Type | Description | Example Models |
| --- | --- | --- |
| chat | Text generation, conversation | GPT-4o, Claude Sonnet, Gemini |
| image | Image generation | DALL-E, Stable Diffusion |
| video | Video generation | Runway, Pika |
| tts | Text-to-speech | OpenAI TTS, ElevenLabs |
| embeddings | Text embeddings | text-embedding-3, Ada |
| ocr | Document processing | OpenAI Vision, Claude Vision |

Chat gates also support optional subtypes for specialized models:
  • Reasoning — o3, o4-mini, Gemini 2.5 Pro
  • Code — Codestral, Devstral
  • Realtime — gpt-4o-realtime

Routing Strategies

Gates support three strategies for handling requests:

Single (Default)

Use only the primary model. No fallbacks. Simplest and most predictable.

Fallback

Try the primary model first. If it fails (provider outage, rate limit, etc.), try each fallback model in order until one succeeds.
Primary: claude-sonnet-4 → fails
Fallback 1: gpt-4o → fails
Fallback 2: gemini-2.0-flash → succeeds ✓
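
The ordering above can be sketched in a few lines of TypeScript. This is an illustration of the fallback idea only, not Layer's implementation: `callWithFallback`, `ModelCall`, and `FallbackResult` are hypothetical names, and the sketch assumes that any thrown error counts as a failure.

```typescript
// Hypothetical types for illustration; not part of the Layer SDK.
type ModelCall<T> = (model: string) => Promise<T>;

interface FallbackResult<T> {
  model: string;      // the model that produced the result
  result: T;
  failures: string[]; // models that failed before it, in order
}

// Try the primary model, then each fallback in order,
// and return the first successful result.
async function callWithFallback<T>(
  primary: string,
  fallbackModels: string[],
  call: ModelCall<T>,
): Promise<FallbackResult<T>> {
  const failures: string[] = [];
  for (const model of [primary, ...fallbackModels]) {
    try {
      const result = await call(model);
      return { model, result, failures };
    } catch {
      // Assumption: every error (outage, rate limit, ...) triggers the next model.
      failures.push(model);
    }
  }
  throw new Error(`All models failed: ${failures.join(', ')}`);
}
```

If every model in the chain fails, the sketch throws; how Layer reports that case is not covered here.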

Round-Robin

Randomly distribute requests across the primary model and all fallback models. Useful for load balancing or informal cost distribution across providers.
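
A minimal sketch of this kind of selection, assuming each request picks uniformly at random from the primary and fallback models (the exact distribution Layer uses is not stated here). `pickModel` is a hypothetical helper, and the `random` parameter exists only so the choice can be tested deterministically.

```typescript
// Pick one model from the pool made of the primary and all fallbacks.
// Assumption: uniform random choice per request.
function pickModel(
  primary: string,
  fallbackModels: string[],
  random: () => number = Math.random,
): string {
  const pool = [primary, ...fallbackModels];
  // random() is in [0, 1), so the index is always within the pool.
  const index = Math.floor(random() * pool.length);
  return pool[index] ?? primary;
}
```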

Creating a Gate

From the Dashboard

  1. Go to Dashboard → Gates → Create New Gate
  2. Fill in the required fields:
    • Name — A human-readable label for the gate (e.g., customer-support). The gate ID (UUID) is what you use in API calls.
    • Task Type — What kind of requests this gate handles
    • Model — The primary model to use
  3. Optionally configure:
    • Fallback models and routing strategy
    • System prompt applied to all requests
    • Temperature, max tokens, and top P defaults
    • Spending limits with alert or block enforcement
    • Structured output (JSON schema) for consistent response formats

Via the API

curl -X POST https://api.uselayer.ai/v1/gates \
  -H "Authorization: Bearer layer_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "customer-support",
    "taskType": "chat",
    "model": "claude-sonnet-4-20250514",
    "description": "Handles customer support queries",
    "fallbackModels": ["gpt-4o"],
    "routingStrategy": "fallback",
    "temperature": 0.7,
    "systemPrompt": "You are a helpful customer support agent."
  }'
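
The same request can be built from TypeScript. The sketch below uses only the endpoint, headers, and body fields shown in the curl example; `CreateGateBody` and `buildCreateGateRequest` are hypothetical names for illustration, and the response shape is not documented here, so the sketch stops at building the request.

```typescript
// Body fields taken from the curl example above; names of the type and
// helper are illustrative, not part of the Layer SDK.
interface CreateGateBody {
  name: string;
  taskType: string;
  model: string;
  description?: string;
  fallbackModels?: string[];
  routingStrategy?: 'single' | 'fallback' | 'round-robin';
  temperature?: number;
  systemPrompt?: string;
}

// Build the URL and fetch options for creating a gate.
function buildCreateGateRequest(apiKey: string, body: CreateGateBody) {
  return {
    url: 'https://api.uselayer.ai/v1/gates',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Usage (not executed here):
//   const { url, init } = buildCreateGateRequest(apiKey, { ... });
//   const res = await fetch(url, init);
```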

Using a Gate

Once created, reference the gate in your requests:
import { Layer } from '@layer-ai/sdk';

const layer = new Layer({
  apiKey: process.env.LAYER_API_KEY
});

const response = await layer.chat({
  gateId: 'your-gate-uuid',
  data: {
    messages: [
      { role: 'user', content: 'How do I reset my password?' }
    ]
  }
});

Gate Configuration Reference

This reference covers the full gate configuration interface. Fields marked as required must be provided when creating a gate; all other fields are optional.

Core Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Human-readable label for the gate (e.g., customer-support). Must be unique per user. Used for display in the dashboard and logs. |
| model | SupportedModel | Yes | Primary model to use for requests (e.g., claude-sonnet-4-20250514, gpt-4o). |
| taskType | ModelType | Yes | Determines which models are available. One of: chat, image, video, tts, embeddings, ocr, audio, stt, moderation. |
| taskSubtype | ModelSubtype | No | Specialization for chat models. One of: reasoning, code, realtime. |
| description | string | No | Describes what this gate is for. Used by the Architect for smart routing recommendations. The dashboard offers an Auto-enhance description option. If you accept the Architect’s suggestion, the enhanced version replaces this field. |
| tags | string[] | No | Organizational labels for filtering and grouping gates. |

Model Routing

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| routingStrategy | 'single' \| 'fallback' \| 'round-robin' | 'single' | How requests are distributed across models. |
| fallbackModels | SupportedModel[] | [] | Ordered list of fallback models. Used by fallback and round-robin strategies. |

Request Parameters

Defaults applied to all requests through the gate. Can be overridden per-request if allowOverrides is configured.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| systemPrompt | string | | System prompt prepended to all requests. |
| temperature | number | | Controls randomness (0 = deterministic, 2 = most random). |
| maxTokens | number | | Maximum response length in tokens. |
| topP | number | | Nucleus sampling threshold (0 - 1). Lower = more focused. |
| allowOverrides | boolean \| OverrideConfig | false | Whether clients can override gate defaults per-request. Can be true (all overrides) or an object specifying which fields: { model?: boolean, temperature?: boolean, maxTokens?: boolean, topP?: boolean }. |
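
One way to picture how allowOverrides interacts with gate defaults is the merge below. It is a sketch under stated assumptions, not Layer's actual resolution logic: `resolveParams` and `RequestParams` are hypothetical names, and the sketch assumes that overrides which are not permitted are silently ignored rather than rejected.

```typescript
// Hypothetical shapes based on the fields in the table above.
interface RequestParams {
  model?: string;
  temperature?: number;
  maxTokens?: number;
  topP?: number;
}
type OverrideConfig = { [K in keyof RequestParams]?: boolean };

const OVERRIDABLE: ReadonlyArray<keyof RequestParams> = [
  'model', 'temperature', 'maxTokens', 'topP',
];

// Copy a single field in a type-safe way.
function copyField<K extends keyof RequestParams>(
  target: RequestParams, source: RequestParams, key: K,
): void {
  target[key] = source[key];
}

// Start from the gate defaults and apply only the overrides that are allowed.
function resolveParams(
  defaults: RequestParams,
  overrides: RequestParams,
  allowOverrides: boolean | OverrideConfig = false,
): RequestParams {
  const resolved: RequestParams = { ...defaults };
  for (const key of OVERRIDABLE) {
    const allowed =
      allowOverrides === true ||
      (typeof allowOverrides === 'object' && allowOverrides[key] === true);
    // Assumption: disallowed overrides are dropped, not treated as errors.
    if (allowed && overrides[key] !== undefined) {
      copyField(resolved, overrides, key);
    }
  }
  return resolved;
}
```

With defaults of `{ temperature: 0.7 }` and `allowOverrides: { temperature: true }`, a request that sends `temperature: 0.2` and `maxTokens: 100` would resolve to a temperature of 0.2 while `maxTokens` stays unset.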

Smart Routing (Architect)

Layer’s AI agent (the Architect) analyzes your gate’s description and usage patterns to recommend optimal models.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| costWeight | number | | Weight for cost optimization (0 - 1). |
| latencyWeight | number | | Weight for latency optimization (0 - 1). |
| qualityWeight | number | | Weight for quality optimization (0 - 1). |
| analysisMethod | 'cost' \| 'balanced' \| 'performance' \| 'custom' | | Preset optimization profile. custom uses the individual weights above. |
| maxCostPer1kTokens | number | | Maximum acceptable cost per 1,000 tokens (USD). |
| maxLatencyMs | number | | Maximum acceptable latency in milliseconds. |
| reanalysisPeriod | 'daily' \| 'weekly' \| 'monthly' \| 'never' | 'never' | How often the Architect re-evaluates model recommendations. |
| autoApplyRecommendations | boolean | false | Automatically apply Architect recommendations without manual review. |
| taskAnalysis | TaskAnalysis | | Read-only. The Architect’s current recommendation including primary model, alternatives, and reasoning. |
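
As an illustration, a custom profile might look like the fragment below. The `ArchitectConfig` type and `weightsInRange` helper are hypothetical and cover only the writable fields listed above; the values are examples, and whether the three weights need to sum to 1 is not stated, so the sketch only checks the documented 0 - 1 range.

```typescript
// Local type covering only the writable Architect fields from the table above.
interface ArchitectConfig {
  analysisMethod?: 'cost' | 'balanced' | 'performance' | 'custom';
  costWeight?: number;
  latencyWeight?: number;
  qualityWeight?: number;
  maxCostPer1kTokens?: number;
  maxLatencyMs?: number;
  reanalysisPeriod?: 'daily' | 'weekly' | 'monthly' | 'never';
  autoApplyRecommendations?: boolean;
}

// Example values only: favor cost, review recommendations manually.
const architectConfig: ArchitectConfig = {
  analysisMethod: 'custom',
  costWeight: 0.5,
  latencyWeight: 0.2,
  qualityWeight: 0.3,
  maxLatencyMs: 2000,
  reanalysisPeriod: 'weekly',
  autoApplyRecommendations: false,
};

// Check that every weight that is set falls in the documented 0 - 1 range.
function weightsInRange(cfg: ArchitectConfig): boolean {
  return [cfg.costWeight, cfg.latencyWeight, cfg.qualityWeight]
    .filter((w): w is number => w !== undefined)
    .every((w) => w >= 0 && w <= 1);
}
```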

Structured Output

Force responses into a consistent format. OpenAI models use native JSON schema support; for other providers, the schema is injected into the prompt.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| responseFormatEnabled | boolean | false | Enable structured output. |
| responseFormatType | 'text' \| 'json_object' \| 'json_schema' | 'text' | Output format. json_object guarantees valid JSON. json_schema validates against a strict schema. |
| responseFormatSchema | object | | JSON schema for json_schema mode. Defines the exact structure of responses. |
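
A configuration fragment for json_schema mode could look like the following. The `StructuredOutputConfig` type and the ticket-classification schema are made up for illustration, and the sketch assumes responseFormatSchema accepts a plain JSON Schema object without any extra wrapper.

```typescript
// Local type mirroring the three fields in the table above.
interface StructuredOutputConfig {
  responseFormatEnabled: boolean;
  responseFormatType: 'text' | 'json_object' | 'json_schema';
  responseFormatSchema?: Record<string, unknown>;
}

// Example: ask for a small, fixed object describing a support ticket.
const ticketFormat: StructuredOutputConfig = {
  responseFormatEnabled: true,
  responseFormatType: 'json_schema',
  responseFormatSchema: {
    type: 'object',
    properties: {
      category: { type: 'string', enum: ['billing', 'technical', 'other'] },
      urgent: { type: 'boolean' },
    },
    required: ['category', 'urgent'],
    additionalProperties: false,
  },
};
```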

Spending Limits

Control costs at the gate level. See Spending for account-level controls.

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| spendingLimit | number \| null | null | Dollar cap per period. null = no limit. |
| spendingLimitPeriod | 'monthly' \| 'daily' | 'monthly' | How often the spending counter resets. |
| spendingEnforcement | 'alert_only' \| 'block' | 'alert_only' | alert_only warns but allows requests. block rejects requests when the limit is hit. |

Agent gates have additional configuration fields for sessions, sub-gates, and orchestration. See the Agent Gates documentation for those fields.
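
The two enforcement modes for gate-level spending limits can be sketched as a small decision function. This is an illustration rather than Layer's implementation: `checkSpending` and `SpendingDecision` are hypothetical names, and the sketch assumes the limit counts as "hit" once current spending is greater than or equal to the cap.

```typescript
// Hypothetical shapes based on the spending fields described in this reference.
interface SpendingConfig {
  spendingLimit: number | null;
  spendingEnforcement: 'alert_only' | 'block';
  spendingCurrent: number; // current period spending (USD)
}

interface SpendingDecision {
  allowed: boolean; // whether the request may proceed
  alert: boolean;   // whether a limit warning should be raised
}

// Decide what to do with a request given the gate's spending state.
function checkSpending(cfg: SpendingConfig): SpendingDecision {
  // null means no limit is configured.
  if (cfg.spendingLimit === null) return { allowed: true, alert: false };
  // Assumption: the limit is hit when spending reaches or exceeds the cap.
  const limitHit = cfg.spendingCurrent >= cfg.spendingLimit;
  if (!limitHit) return { allowed: true, alert: false };
  // alert_only warns but allows; block rejects.
  return { allowed: cfg.spendingEnforcement !== 'block', alert: true };
}
```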

Read-Only Fields

These fields are returned by the API but cannot be set directly.

| Field | Type | Description |
| --- | --- | --- |
| id | string | Gate UUID. This is what you use in API calls. |
| userId | string | Owner’s user ID. |
| createdAt | Date | When the gate was created. |
| updatedAt | Date | When the gate was last modified. |
| spendingCurrent | number | Current period spending (USD). |
| spendingPeriodStart | string | Start of the current spending period. |
| spendingStatus | 'active' \| 'suspended' | Whether the gate is active or suspended due to spending limit. |

Dashboard Tabs

When editing a gate in the dashboard, configuration is organized into tabs:

Basic Info

Name, description, task type, subtypes, and tags. Includes an Auto-enhance description toggle — when enabled, the Architect generates an improved version of your description optimized for smart routing analysis. If you accept the suggestion, it replaces your gate’s description field with the enhanced version.

Models & Routing

Primary model, fallback models, routing strategy, optimization weights, and smart routing (Architect) configuration.

Request Config

System prompt, temperature, max tokens, top P, and override permissions.

Spending Limits

Gate-level spending cap, period, and enforcement type.

Connections

Shows which agent gates use this gate as a sub-gate. If an agent gate references this gate for routing requests, it appears here with an AGENT badge and the agent gate’s name. You can Detach the gate from the agent gate directly from this tab. Apart from Detach, this tab is read-only context: you can't add or configure connections here. Sub-gate relationships are managed from the agent gate’s Sub-Gates tab. See Agent Gates for details on how sub-gates work.

Danger Zone

Delete the gate. This action is irreversible. If the gate is used as a sub-gate by any agent gates, it will be detached from them automatically.

Gate Limits

The number of gates you can create depends on your plan tier. Check your current usage at Dashboard → Settings.