Gates

What is a Gate?

A gate is the core building block of Layer AI. It’s a configuration that sits between your application and AI providers, controlling how requests are routed, which models are used, and how failures are handled. When you make a request through Layer, you reference a gate by its ID (a UUID). The gate determines:

- Which model handles the request
- What happens if that model fails
- What parameters are applied (temperature, max tokens, etc.)
- Whether spending limits are enforced
- What system prompt is prepended

Gate Types

Layer supports two gate types:

Type	Use Case
Standard	Discrete, stateless LLM calls — chat, image generation, embeddings, etc.
Agent	Multi-turn agent workflows with session tracking, tool visibility, and per-session spending. See Agent Gates.

Task Types

Every gate is configured for a specific task type, which determines which models are available:

Task Type	Description	Example Models
`chat`	Text generation, conversation	GPT-4o, Claude Sonnet, Gemini
`image`	Image generation	DALL-E, Stable Diffusion
`video`	Video generation	Runway, Pika
`tts`	Text-to-speech	OpenAI TTS, ElevenLabs
`embeddings`	Text embeddings	text-embedding-3, Ada
`ocr`	Document processing	OpenAI Vision, Claude Vision

Chat gates also support optional subtypes for specialized models:

- Reasoning: o3, o4-mini, Gemini 2.5 Pro
- Code: Codestral, Devstral
- Realtime: gpt-4o-realtime

Routing Strategies

Gates support three strategies for handling requests:

Single (Default)

Use only the primary model. No fallbacks. Simplest and most predictable.

Fallback

Try the primary model first. If it fails (provider outage, rate limit, etc.), try each fallback model in order until one succeeds.

Primary: claude-sonnet-4 → fails
Fallback 1: gpt-4o → fails
Fallback 2: gemini-2.0-flash → succeeds ✓
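
The sequence above can be sketched as a simple loop. This is a minimal illustration, not Layer's implementation: `routeWithFallback`, `Attempt`, and `RouteResult` are hypothetical names, and the provider call is simulated synchronously to keep the example short (a real call would be asynchronous).

```typescript
// Hypothetical names throughout; Layer's internal routing code is not shown in this document.
type Attempt<T> = (model: string) => T; // expected to throw when the provider call fails

interface RouteResult<T> {
  model: string;    // the model that produced the result
  result: T;
  failed: string[]; // models that were tried and failed, in order
}

// Try the primary model, then each fallback in order, and return the first success.
function routeWithFallback<T>(
  primary: string,
  fallbacks: string[],
  attempt: Attempt<T>,
): RouteResult<T> {
  const failed: string[] = [];
  for (const model of [primary, ...fallbacks]) {
    try {
      return { model, result: attempt(model), failed };
    } catch {
      failed.push(model); // record the failure and move on to the next model
    }
  }
  throw new Error(`All models failed: ${failed.join(", ")}`);
}

// Simulate the scenario above: the first two providers are unavailable.
const unavailable = ["claude-sonnet-4", "gpt-4o"];
const outcome = routeWithFallback(
  "claude-sonnet-4",
  ["gpt-4o", "gemini-2.0-flash"],
  (model) => {
    if (unavailable.indexOf(model) !== -1) throw new Error(`${model} unavailable`);
    return `reply from ${model}`;
  },
);
console.log(outcome.model); // gemini-2.0-flash
```

If every model fails, this sketch throws an error listing the models it tried. How Layer reports that case is not described here.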

Round-Robin

Randomly distribute requests across the primary model and all fallback models. Useful for load balancing or informal cost distribution across providers.
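
As a rough sketch of the selection described above, assuming a uniform random choice over the primary model and its fallbacks (the actual distribution Layer uses is not specified here). `pickModel` is a hypothetical helper, and the random source is a parameter so the choice can be made deterministic.

```typescript
// Hypothetical helper; assumes each model in the pool is equally likely to be chosen.
function pickModel(
  primary: string,
  fallbacks: string[],
  random: () => number = Math.random,
): string {
  const candidates = [primary, ...fallbacks];
  // random() is expected to return a value in [0, 1), so the index stays in range.
  const index = Math.floor(random() * candidates.length);
  return candidates[index];
}

// With a fixed "random" value of 0.5 and three models, the middle one is chosen.
console.log(pickModel("claude-sonnet-4", ["gpt-4o", "gemini-2.0-flash"], () => 0.5)); // gpt-4o
```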

Creating a Gate

From the Dashboard

1. Go to Dashboard → Gates → Create New Gate
2. Fill in the required fields:
   - Name: A human-readable label for the gate (e.g., customer-support). The gate ID (UUID) is what you use in API calls.
   - Task Type: What kind of requests this gate handles
   - Model: The primary model to use
3. Optionally configure:
   - Fallback models and routing strategy
   - System prompt applied to all requests
   - Temperature, max tokens, and top P defaults
   - Spending limits with alert or block enforcement
   - Structured output (JSON schema) for consistent response formats

Via the API

curl -X POST https://api.uselayer.ai/v1/gates \
  -H "Authorization: Bearer layer_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "customer-support",
    "taskType": "chat",
    "model": "claude-sonnet-4-20250514",
    "description": "Handles customer support queries",
    "fallbackModels": ["gpt-4o"],
    "routingStrategy": "fallback",
    "temperature": 0.7,
    "systemPrompt": "You are a helpful customer support agent."
  }'

Using a Gate

Once created, reference the gate in your requests:

The examples below show the same request made with the Layer SDK, the OpenAI SDK, and cURL, in that order.

import { Layer } from '@layer-ai/sdk';

const layer = new Layer({
  apiKey: process.env.LAYER_API_KEY
});

const response = await layer.chat({
  gateId: 'your-gate-uuid',
  data: {
    messages: [
      { role: 'user', content: 'How do I reset my password?' }
    ]
  }
});

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://api.uselayer.ai/v1',
  apiKey: process.env.LAYER_API_KEY,
  defaultHeaders: {
    'x-layer-gate-id': 'your-gate-uuid'
  }
});

const response = await openai.chat.completions.create({
  model: 'claude-sonnet-4-20250514',
  messages: [
    { role: 'user', content: 'How do I reset my password?' }
  ]
});

curl -X POST https://api.uselayer.ai/v3/chat \
  -H "Authorization: Bearer layer_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "gateId": "your-gate-uuid",
    "data": {
      "messages": [
        { "role": "user", "content": "How do I reset my password?" }
      ]
    }
  }'

Gate Configuration Reference

The full gate configuration interface. Fields marked required must be provided when creating a gate. All other fields are optional.

Core Fields

Field	Type	Required	Description
`name`	`string`	Yes	Human-readable label for the gate (e.g., `customer-support`). Must be unique per user. Used for display in the dashboard and logs.
`model`	`SupportedModel`	Yes	Primary model to use for requests (e.g., `claude-sonnet-4-20250514`, `gpt-4o`).
`taskType`	`ModelType`	Yes	Determines which models are available. One of: `chat`, `image`, `video`, `tts`, `embeddings`, `ocr`, `audio`, `stt`, `moderation`.
`taskSubtype`	`ModelSubtype`	No	Specialization for chat models. One of: `reasoning`, `code`, `realtime`.
`description`	`string`	No	Describes what this gate is for. Used by the Architect for smart routing recommendations. The dashboard offers an Auto-enhance description option — if you accept the Architect’s suggestion, the enhanced version replaces this field.
`tags`	`string[]`	No	Organizational labels for filtering and grouping gates.
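
The core fields can be mirrored in a local type for client-side checks before a gate is created. The sketch below is an assumption-laden illustration: `GateCoreFields` and `validateCoreFields` are hypothetical names, `model` is typed as a plain string in place of `SupportedModel`, and the rule that `taskSubtype` only applies to chat gates is inferred from the description above rather than from a documented validation rule.

```typescript
type TaskType =
  | "chat" | "image" | "video" | "tts" | "embeddings"
  | "ocr" | "audio" | "stt" | "moderation";
type TaskSubtype = "reasoning" | "code" | "realtime";

// Hypothetical local mirror of the core fields listed in the table above.
interface GateCoreFields {
  name: string;
  model: string; // `SupportedModel` in the reference; a plain string here
  taskType: TaskType;
  taskSubtype?: TaskSubtype;
  description?: string;
  tags?: string[];
}

// Returns a list of problems; an empty list means the required fields are present.
function validateCoreFields(input: Partial<GateCoreFields>): string[] {
  const problems: string[] = [];
  if (!input.name) problems.push("name is required");
  if (!input.model) problems.push("model is required");
  if (!input.taskType) problems.push("taskType is required");
  // Assumption: subtypes are only meaningful for chat gates.
  if (input.taskSubtype && input.taskType !== "chat") {
    problems.push("taskSubtype applies only to chat gates");
  }
  return problems;
}

console.log(validateCoreFields({ name: "customer-support", taskType: "chat" }));
// reports that model is required
```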

Model Routing

Field	Type	Default	Description
`routingStrategy`	`'single' \| 'fallback' \| 'round-robin'`	`'single'`	How requests are distributed across models.
`fallbackModels`	`SupportedModel[]`	`[]`	Ordered list of fallback models. Used by `fallback` and `round-robin` strategies.

Request Parameters

Defaults applied to all requests through the gate. Clients can override them per-request only if `allowOverrides` is configured.

Field	Type	Default	Description
`systemPrompt`	`string`	—	System prompt prepended to all requests.
`temperature`	`number`	—	Controls randomness (0 = deterministic, 2 = most random).
`maxTokens`	`number`	—	Maximum response length in tokens.
`topP`	`number`	—	Nucleus sampling threshold (0 - 1). Lower = more focused.
`allowOverrides`	`boolean \| OverrideConfig`	`false`	Whether clients can override gate defaults per-request. Can be `true` (all overrides) or an object specifying which fields: `{ model?: boolean, temperature?: boolean, maxTokens?: boolean, topP?: boolean }`.
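
One way to picture how `allowOverrides` interacts with gate defaults is the sketch below. It is not Layer's implementation: `resolveParams` is a hypothetical function, and it assumes that an override the gate does not permit is silently ignored rather than rejected with an error.

```typescript
// Hypothetical types based on the fields in the table above.
interface RequestParams {
  model?: string;
  temperature?: number;
  maxTokens?: number;
  topP?: number;
}

interface OverrideConfig {
  model?: boolean;
  temperature?: boolean;
  maxTokens?: boolean;
  topP?: boolean;
}

// Apply per-request values on top of gate defaults, but only for permitted fields.
// Assumption: a non-permitted override is ignored rather than rejected.
function resolveParams(
  gateDefaults: RequestParams,
  requested: RequestParams,
  allowOverrides: boolean | OverrideConfig,
): RequestParams {
  const allowed = (field: keyof OverrideConfig): boolean =>
    typeof allowOverrides === "boolean" ? allowOverrides : allowOverrides[field] === true;

  const pick = <T>(field: keyof OverrideConfig, fromRequest: T | undefined, fromGate: T | undefined) =>
    allowed(field) && fromRequest !== undefined ? fromRequest : fromGate;

  return {
    model: pick("model", requested.model, gateDefaults.model),
    temperature: pick("temperature", requested.temperature, gateDefaults.temperature),
    maxTokens: pick("maxTokens", requested.maxTokens, gateDefaults.maxTokens),
    topP: pick("topP", requested.topP, gateDefaults.topP),
  };
}

const gateDefaults: RequestParams = { model: "claude-sonnet-4-20250514", temperature: 0.7, maxTokens: 1024 };
const perRequest: RequestParams = { model: "gpt-4o", temperature: 0.2 };

// Only temperature may be overridden, so model and maxTokens keep the gate defaults.
console.log(resolveParams(gateDefaults, perRequest, { temperature: true }));
```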

Smart Routing (Architect)

Layer’s AI agent (the Architect) analyzes your gate’s description and usage patterns to recommend optimal models.

Field	Type	Default	Description
`costWeight`	`number`	—	Weight for cost optimization (0 - 1).
`latencyWeight`	`number`	—	Weight for latency optimization (0 - 1).
`qualityWeight`	`number`	—	Weight for quality optimization (0 - 1).
`analysisMethod`	`'cost' \| 'balanced' \| 'performance' \| 'custom'`	—	Preset optimization profile. `custom` uses the individual weights above.
`maxCostPer1kTokens`	`number`	—	Maximum acceptable cost per 1,000 tokens (USD).
`maxLatencyMs`	`number`	—	Maximum acceptable latency in milliseconds.
`reanalysisPeriod`	`'daily' \| 'weekly' \| 'monthly' \| 'never'`	`'never'`	How often the Architect re-evaluates model recommendations.
`autoApplyRecommendations`	`boolean`	`false`	Automatically apply Architect recommendations without manual review.
`taskAnalysis`	`TaskAnalysis`	—	Read-only. The Architect’s current recommendation including primary model, alternatives, and reasoning.
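
The Architect's scoring method is not documented here. Purely to illustrate how three weights and two ceilings of this kind could interact, the sketch below ranks candidates with a weighted sum over normalized metrics. Every name, number, and formula in it is hypothetical, including the `Candidate` data.

```typescript
// Everything here is hypothetical: the candidate data, the normalization, and the
// weighted sum are illustrative and are not Layer's documented algorithm.
interface Candidate {
  model: string;
  costPer1kTokens: number; // USD, assumed positive
  latencyMs: number;       // assumed positive
  quality: number;         // 0 to 1, higher is better
}

interface RoutingPreferences {
  costWeight: number;
  latencyWeight: number;
  qualityWeight: number;
  maxCostPer1kTokens?: number;
  maxLatencyMs?: number;
}

function rankCandidates(candidates: Candidate[], prefs: RoutingPreferences): Candidate[] {
  // Treat the max* fields as hard ceilings and drop anything above them.
  const eligible = candidates.filter(
    (c) =>
      (prefs.maxCostPer1kTokens === undefined || c.costPer1kTokens <= prefs.maxCostPer1kTokens) &&
      (prefs.maxLatencyMs === undefined || c.latencyMs <= prefs.maxLatencyMs),
  );
  if (eligible.length === 0) return [];

  // Normalize cost and latency against the worst eligible candidate so that
  // each term falls between 0 and 1, with higher meaning better.
  const worstCost = Math.max(...eligible.map((c) => c.costPer1kTokens));
  const worstLatency = Math.max(...eligible.map((c) => c.latencyMs));
  const score = (c: Candidate): number =>
    prefs.costWeight * (1 - c.costPer1kTokens / worstCost) +
    prefs.latencyWeight * (1 - c.latencyMs / worstLatency) +
    prefs.qualityWeight * c.quality;

  // Sort a copy, highest score first.
  return eligible.slice().sort((a, b) => score(b) - score(a));
}

const ranked = rankCandidates(
  [
    { model: "model-a", costPer1kTokens: 0.01, latencyMs: 800, quality: 0.9 },
    { model: "model-b", costPer1kTokens: 0.002, latencyMs: 400, quality: 0.7 },
    { model: "model-c", costPer1kTokens: 0.03, latencyMs: 1500, quality: 0.95 },
  ],
  { costWeight: 0.6, latencyWeight: 0.2, qualityWeight: 0.2, maxLatencyMs: 1000 },
);
// model-b ranks first, then model-a; model-c is dropped for exceeding maxLatencyMs.
console.log(ranked.map((c) => c.model));
```

With a cost-heavy weighting, the cheaper and faster model wins despite its lower quality score. Shifting weight toward `qualityWeight` would reverse that order in this toy data.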

Structured Output

Force responses into a consistent format. OpenAI models use native JSON schema support; for other providers, the schema is injected into the prompt.

Field	Type	Default	Description
`responseFormatEnabled`	`boolean`	`false`	Enable structured output.
`responseFormatType`	`'text' \| 'json_object' \| 'json_schema'`	`'text'`	Output format. `json_object` guarantees valid JSON. `json_schema` validates against a strict schema.
`responseFormatSchema`	`object`	—	JSON schema for `json_schema` mode. Defines the exact structure of responses.
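
For illustration, a structured output configuration for a hypothetical ticket-triage gate might look like the following. The three field names come from the table above; the schema contents are invented for the example, and whether Layer expects the schema bare (as here) or wrapped in an envelope is an assumption to verify against the API.

```typescript
// Field names come from the table above; the schema itself is an illustrative example.
const structuredOutputConfig = {
  responseFormatEnabled: true,
  responseFormatType: "json_schema" as const,
  responseFormatSchema: {
    type: "object",
    properties: {
      category: { type: "string", enum: ["billing", "technical", "other"] },
      urgent: { type: "boolean" },
    },
    required: ["category", "urgent"],
    additionalProperties: false,
  },
};

// Serialize it the way it would appear in a JSON request body.
console.log(JSON.stringify(structuredOutputConfig, null, 2));
```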

Spending Limits

Control costs at the gate level. See Spending for account-level controls.

Field	Type	Default	Description
`spendingLimit`	`number \| null`	`null`	Dollar cap per period. `null` = no limit.
`spendingLimitPeriod`	`'monthly' \| 'daily'`	`'monthly'`	How often the spending counter resets.
`spendingEnforcement`	`'alert_only' \| 'block'`	`'alert_only'`	`alert_only` warns but allows requests. `block` rejects requests when the limit is hit.
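
The two enforcement modes can be summarized as a small decision function. This is a sketch under stated assumptions: `checkSpending` and the `Decision` values are hypothetical names, and the limit is treated as "hit" once current spending is greater than or equal to the cap.

```typescript
type Enforcement = "alert_only" | "block";
type Decision = "allow" | "allow_with_alert" | "reject"; // hypothetical labels

// Decide what happens to a request given the gate's current spending.
// Assumption: the limit counts as hit when spendingCurrent >= spendingLimit.
function checkSpending(
  spendingCurrent: number,
  spendingLimit: number | null,
  enforcement: Enforcement,
): Decision {
  // null means no limit is configured, so every request is allowed.
  if (spendingLimit === null || spendingCurrent < spendingLimit) return "allow";
  return enforcement === "block" ? "reject" : "allow_with_alert";
}

console.log(checkSpending(42.5, 100, "block"));     // allow
console.log(checkSpending(100, 100, "alert_only")); // allow_with_alert
console.log(checkSpending(120, 100, "block"));      // reject
console.log(checkSpending(120, null, "block"));     // allow
```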

Agent gates have additional configuration fields for sessions, sub-gates, and orchestration. See the Agent Gates documentation for those fields.

Read-Only Fields

These fields are returned by the API but cannot be set directly.

Field	Type	Description
`id`	`string`	Gate UUID. This is what you use in API calls.
`userId`	`string`	Owner’s user ID.
`createdAt`	`Date`	When the gate was created.
`updatedAt`	`Date`	When the gate was last modified.
`spendingCurrent`	`number`	Current period spending (USD).
`spendingPeriodStart`	`string`	Start of the current spending period.
`spendingStatus`	`'active' \| 'suspended'`	Whether the gate is active or suspended due to spending limit.

Dashboard Tabs

When editing a gate in the dashboard, configuration is organized into tabs:

Basic Info

Name, description, task type, subtypes, and tags. Includes an Auto-enhance description toggle — when enabled, the Architect generates an improved version of your description optimized for smart routing analysis. If you accept the suggestion, it replaces your gate’s description field with the enhanced version.

Models & Routing

Primary model, fallback models, routing strategy, optimization weights, and smart routing (Architect) configuration.

Request Config

System prompt, temperature, max tokens, top P, and override permissions.

Spending Limits

Gate-level spending cap, period, and enforcement type.

Connections

Shows which agent gates use this gate as a sub-gate. If an agent gate references this gate for routing requests, it appears here with an AGENT badge and the agent gate’s name. You can Detach the gate from the agent gate directly from this tab. Apart from detaching, the tab is read-only context: you don’t create or configure connections here. Sub-gate relationships are managed from the agent gate’s Sub-Gates tab. See Agent Gates for details on how sub-gates work.

Danger Zone

Delete the gate. This action is irreversible. If the gate is used as a sub-gate by any agent gates, it will be detached from them automatically.

Gate Limits

The number of gates you can create depends on your plan tier. Check your current usage at Dashboard → Settings.
