Agent Gates - Layer AI

What are Agent Gates?

Agent gates extend standard gates with session tracking, tool visibility, and per-session spending controls. They’re designed for multi-step AI agent workflows where a single user intent triggers many model calls. Agent gates are not a separate system: they’re standard gates with additional configuration. They use the same proxy, SDK, and dashboard, with session awareness layered on top.

When to Use Agent Gates

Scenario	Gate Type
Single LLM call (chatbot, summarization)	Standard gate
Multi-turn agent with tool calls	Agent gate
Agent with multiple models for different tasks	Agent gate (orchestrated)
Need session-level cost tracking	Agent gate

Key Concepts

Sessions

A session groups related LLM requests into one agent run. You generate a session ID with the SDK, and all requests sharing that ID are tracked together.

import { Layer } from '@layer-ai/sdk';

const layer = new Layer({ apiKey: process.env.LAYER_API_KEY });
const sessionId = layer.generateSessionId(); // UUIDv4, no API call

Sessions have a lifecycle:

active → idle → completed
              → runaway (requests after completed)
              → budget_exceeded (hard limit hit, blocks requests)

Active — Accepting requests. Default state when a session is created.
Idle — No requests received within the timeout window (default 30 min). Automatically reactivates to active if a new request arrives with the same session ID — idle is not terminal.
Completed — Developer called endSession(). If requests continue after completion, the session flips to runaway.
Runaway — Requests arrived after the session was marked completed. Layer allows the requests but flags the session. Useful for detecting agent loops or unexpected behavior.
Budget exceeded — Hard spending limit hit. New requests are rejected with HTTP 402. This is the only status that blocks requests.

Sessions are created implicitly — the first request with an unseen sessionId creates the session. No explicit session creation call is needed.
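
The lifecycle above can be read as a small state machine. The TypeScript sketch below models it for illustration only. The status names come from this page, but the event names and the `nextStatus` function are assumptions and are not part of the Layer SDK. Edge cases the page does not specify (for example, calling endSession() on a session that is already runaway) are guesses.

```typescript
// Illustrative model of the session lifecycle; not Layer's implementation.
type SessionStatus = 'active' | 'idle' | 'completed' | 'runaway' | 'budget_exceeded';

// Hypothetical event names, invented for this sketch.
type SessionEvent = 'request' | 'timeout' | 'end_session' | 'hard_limit_hit';

function nextStatus(current: SessionStatus, event: SessionEvent): SessionStatus {
  // budget_exceeded blocks requests, so this sketch treats it as final.
  if (current === 'budget_exceeded') return current;
  if (event === 'hard_limit_hit') return 'budget_exceeded';

  if (event === 'timeout') {
    // Only an active session can go idle.
    return current === 'active' ? 'idle' : current;
  }
  if (event === 'end_session') {
    // Assumption: endSession() only affects sessions that are still open.
    return current === 'active' || current === 'idle' ? 'completed' : current;
  }

  // Remaining event: 'request'.
  if (current === 'completed' || current === 'runaway') return 'runaway';
  return 'active'; // active stays active, idle returns to active
}
```

Tracing a few transitions: an active session that times out becomes idle, a request on an idle session returns it to active, and a request after completion flags the session as runaway.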

Modes

Agent gates operate in one of two modes:

Observability

Passthrough mode. All requests use the gate’s configured model. Layer tracks sessions, extracts tool calls, and provides analytics — but makes no routing decisions.

Orchestrated

Active routing mode. Requests route to different sub-gates based on the task. Layer currently supports only static orchestration, where the developer calls sub-gates directly.

Orchestration Type	Who Decides Routing
Static	Developer calls specific sub-gates at runtime. Layer tracks and groups everything under one session.
Dynamic	Layer picks the best sub-gate from a pool per request. (Coming soon)

Routing Behavior

How requests flow depends on the mode:

Observability

Request arrives at agent gate with sessionId
  → Layer logs the request to the session
  → Request routes through the gate's configured model
  → Response logged, session metrics updated

No routing decisions. Layer is a transparent proxy with session awareness.

Static Orchestrated

Request arrives at agent gate with sessionId
  → Treated as orchestrator reasoning
  → Routes through the agent gate's configured model

Request arrives at sub-gate with sessionId
  → Layer resolves parent agent gate via sub-gate relationship
  → Associates request with the parent's session
  → Routes through the sub-gate's own model, params, and fallbacks
  → Trace labels the request with the sub-gate's name

Both types of requests update the same session’s metrics (cost, tokens, latency, request count).
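
As a rough mental model of the flow above, the sketch below shows how requests sent to an agent gate and to one of its sub-gates could roll up into a single session record. Everything here is hypothetical: the `RequestLog` shape, the `parentOf` map, the gate IDs, and `recordRequest` are invented for illustration and say nothing about how Layer stores sessions internally.

```typescript
// Illustrative sketch only; none of these names exist in the Layer SDK.
interface RequestLog {
  gateId: string;
  sessionId: string;
  cost: number; // USD
  tokens: number;
  latencyMs: number;
}

interface SessionMetrics {
  gateId: string; // the parent agent gate
  totalRequests: number;
  totalCost: number;
  totalTokens: number;
  totalLatencyMs: number;
}

// Hypothetical sub-gate ID -> parent agent gate ID relationship.
const parentOf = new Map<string, string>([
  ['extraction-sub-gate', 'research-agent-gate'],
]);

function recordRequest(
  sessions: Map<string, SessionMetrics>,
  req: RequestLog,
): SessionMetrics {
  // A sub-gate request resolves to its parent; an agent gate request is its own parent.
  const agentGateId = parentOf.get(req.gateId) ?? req.gateId;

  // The first request with an unseen session ID creates the session implicitly.
  const session: SessionMetrics = sessions.get(req.sessionId) ?? {
    gateId: agentGateId,
    totalRequests: 0,
    totalCost: 0,
    totalTokens: 0,
    totalLatencyMs: 0,
  };

  session.totalRequests += 1;
  session.totalCost += req.cost;
  session.totalTokens += req.tokens;
  session.totalLatencyMs += req.latencyMs;

  sessions.set(req.sessionId, session);
  return session;
}
```

Recording one orchestrator request and one sub-gate request with the same `sessionId` leaves a single session entry whose totals include both.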

Sub-Gates

A sub-gate is a regular gate used as a routing target within an agent workflow. Any existing gate can serve as a sub-gate — it doesn’t know or care that it’s being used as one. Sub-gates let you use different models for different parts of your agent:

Orchestrator calls → expensive reasoning model (e.g., Claude Opus)
Data extraction → cheap fast model (e.g., Claude Haiku)
Code generation → code-specialized model (e.g., Codestral)

Session Spending Limits

Two-tier cost control per session:

Tier	Behavior	Default
Soft limit	Request proceeds. Layer adds `X-Layer-Session-Warning: soft_limit_exceeded` header so your code can react.	User-configured
Hard limit	Request rejected (HTTP 402). Session marked `budget_exceeded`.	2x soft limit
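
To make the two tiers concrete, here is a minimal sketch of the decision the table describes. `checkSessionSpend` is a hypothetical helper written for this page, not a Layer API. It assumes a limit counts as exceeded only when the session cost is strictly greater than it, and that an unset soft limit means no limits apply unless a hard limit is given explicitly.

```typescript
// Illustrative sketch of the two-tier check; not Layer's implementation.
type SpendOutcome = 'ok' | 'soft_limit_exceeded' | 'budget_exceeded';

function checkSessionSpend(
  totalCost: number,
  softLimit: number | null,
  hardLimit: number | null = null,
): SpendOutcome {
  // With no explicit hard limit, fall back to 2x the soft limit (the documented default).
  const effectiveHard = hardLimit ?? (softLimit !== null ? softLimit * 2 : null);

  // Hard limit: the request would be rejected with HTTP 402.
  if (effectiveHard !== null && totalCost > effectiveHard) return 'budget_exceeded';

  // Soft limit: the request proceeds, with a warning header on the response.
  if (softLimit !== null && totalCost > softLimit) return 'soft_limit_exceeded';

  return 'ok';
}
```

With a soft limit of 5 USD and no explicit hard limit, a session at 6 USD gets the warning, and a session at 11 USD is blocked because the implied hard limit is 10 USD.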

Integration

Layer SDK

import { Layer } from '@layer-ai/sdk';

const layer = new Layer({ apiKey: process.env.LAYER_API_KEY });
const sessionId = layer.generateSessionId();

// Make requests with the session ID
await layer.chat({
  gateId: 'your-agent-gate-uuid',
  sessionId,
  data: {
    messages: [{ role: 'user', content: 'Research climate change impacts' }]
  }
});

// End the session when done (optional — sessions idle out naturally)
await layer.endSession(sessionId);
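
If you do want every run to end its session, including runs that throw, one option is a small wrapper around `generateSessionId()` and `endSession()`. The `withSession` helper and the `SessionClient` interface below are hypothetical and not part of the SDK; the sketch only assumes the two methods shown above, and that `endSession()` returns a promise.

```typescript
// Hypothetical helper, not part of @layer-ai/sdk.
// SessionClient lists only the two SDK methods this sketch relies on.
interface SessionClient {
  generateSessionId(): string;
  endSession(sessionId: string): Promise<unknown>;
}

async function withSession<T>(
  client: SessionClient,
  run: (sessionId: string) => Promise<T>,
): Promise<T> {
  const sessionId = client.generateSessionId();
  try {
    // All requests made inside run() should pass this sessionId.
    return await run(sessionId);
  } finally {
    // Runs on success and on error, so the session is always marked completed.
    await client.endSession(sessionId);
  }
}
```

A call would look like `await withSession(layer, (sessionId) => layer.chat({ gateId, sessionId, data }))`. One trade-off to keep in mind: once the session is completed, any late request that reuses the same ID flags it as runaway.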

Anthropic SDK (Drop-in)

Keep your existing Anthropic code. Change the base URL and API key, and add two headers:

import Anthropic from '@anthropic-ai/sdk';
import { Layer } from '@layer-ai/sdk';

const layer = new Layer({ apiKey: process.env.LAYER_API_KEY });
const sessionId = layer.generateSessionId();

const anthropic = new Anthropic({
  baseURL: 'https://api.uselayer.ai',
  apiKey: process.env.LAYER_API_KEY,
  defaultHeaders: {
    'x-layer-gate-id': 'your-agent-gate-uuid',
    'x-layer-session-id': sessionId,
  },
});

// All existing code stays identical
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 4096,
  messages: [{ role: 'user', content: 'Research climate change impacts' }],
  tools: toolDefinitions, // your existing tool definitions, declared elsewhere
});

OpenAI SDK (Drop-in)

import OpenAI from 'openai';
import { Layer } from '@layer-ai/sdk';

const layer = new Layer({ apiKey: process.env.LAYER_API_KEY });
const sessionId = layer.generateSessionId();

const openai = new OpenAI({
  baseURL: 'https://api.uselayer.ai/v1',
  apiKey: process.env.LAYER_API_KEY,
  defaultHeaders: {
    'x-layer-gate-id': 'your-agent-gate-uuid',
    'x-layer-session-id': sessionId,
  },
});

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Research climate change impacts' }],
  tools: toolDefinitions, // your existing tool definitions, declared elsewhere
});

Multi-Gate Pattern (Static Orchestrated)

When your agent uses multiple models for different tasks, create separate SDK clients pointing at different sub-gates:

import Anthropic from '@anthropic-ai/sdk';
import { Layer } from '@layer-ai/sdk';

const layer = new Layer({ apiKey: process.env.LAYER_API_KEY });
const sessionId = layer.generateSessionId();

// Orchestrator — expensive model for reasoning
const orchestrator = new Anthropic({
  baseURL: 'https://api.uselayer.ai',
  apiKey: process.env.LAYER_API_KEY,
  defaultHeaders: {
    'x-layer-gate-id': 'your-agent-gate-uuid',
    'x-layer-session-id': sessionId,
  },
});

// Extractor — cheap model for data extraction
const extractor = new Anthropic({
  baseURL: 'https://api.uselayer.ai',
  apiKey: process.env.LAYER_API_KEY,
  defaultHeaders: {
    'x-layer-gate-id': 'your-extraction-sub-gate-uuid',
    'x-layer-session-id': sessionId, // same session ties it together
  },
});

// Main agent loop
const response = await orchestrator.messages.create({ ... });

// Inside a tool function
async function extractData(content: string) {
  return extractor.messages.create({
    model: 'claude-haiku-4-5-20251001',
    max_tokens: 1024, // required by the Anthropic Messages API; adjust to your needs
    messages: [{ role: 'user', content: `Extract key facts: ${content}` }],
  });
}

Layer sees both calls under the same session. The orchestrator calls are labeled with the agent gate name; extraction calls are labeled with the sub-gate name.
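
The two client constructors above differ only in the gate ID, so the shared options can be built by a small factory. `layerClientOptions` below is a hypothetical helper, not something the Layer SDK provides. It returns a plain object with the same `baseURL`, `apiKey`, and `defaultHeaders` fields used in the examples on this page.

```typescript
// Hypothetical helper for building per-gate client options.
interface LayerClientOptions {
  baseURL: string;
  apiKey: string;
  defaultHeaders: Record<string, string>;
}

function layerClientOptions(
  gateId: string,
  sessionId: string,
  apiKey: string,
  baseURL: string = 'https://api.uselayer.ai', // the OpenAI example uses .../v1 instead
): LayerClientOptions {
  return {
    baseURL,
    apiKey,
    defaultHeaders: {
      'x-layer-gate-id': gateId,
      'x-layer-session-id': sessionId, // same value for every client in one agent run
    },
  };
}

// Usage (with the Anthropic SDK imported as in the example above):
//   const orchestrator = new Anthropic(layerClientOptions(agentGateId, sessionId, key));
//   const extractor = new Anthropic(layerClientOptions(extractionGateId, sessionId, key));
```

Keeping the session ID as a required argument makes it harder to create a sub-gate client that accidentally starts a separate session.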

Tool Visibility

Layer automatically extracts tool usage from request and response content — no additional configuration needed:

tool_use blocks in responses — what tools the LLM decided to call
tool_result blocks in requests — what the tools returned

This gives the dashboard a complete picture of your agent’s behavior, including tools that never touch an LLM (like database queries or API calls).
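
For intuition about what this extraction involves, the sketch below pairs `tool_use` blocks from one response with the `tool_result` blocks sent in the following request, using the block shapes of the Anthropic Messages API. It illustrates the idea only: `pairToolCalls` and the `ToolCall` type are hypothetical, and Layer's actual extraction logic is not documented on this page.

```typescript
// Content block shapes follow the Anthropic Messages API (simplified).
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: unknown };

// Hypothetical record of one tool invocation and what it returned.
interface ToolCall {
  id: string;
  name: string;
  input: unknown;
  result: unknown; // undefined if no matching tool_result was found
}

function pairToolCalls(
  responseBlocks: ContentBlock[],
  nextRequestBlocks: ContentBlock[],
): ToolCall[] {
  // Index tool results by the tool_use ID they answer.
  const results = new Map<string, unknown>();
  for (const block of nextRequestBlocks) {
    if (block.type === 'tool_result') results.set(block.tool_use_id, block.content);
  }

  // Walk the model's response and attach each result to its call.
  const calls: ToolCall[] = [];
  for (const block of responseBlocks) {
    if (block.type === 'tool_use') {
      calls.push({
        id: block.id,
        name: block.name,
        input: block.input,
        result: results.get(block.id),
      });
    }
  }
  return calls;
}
```

Because the tool's output travels back to the model inside the next request, a proxy can see what a database query or API call returned even though that tool never called an LLM itself.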

What Layer Does NOT Require

No code rewrite — Keep your existing Anthropic/OpenAI SDK code. Just change the base URL and add headers.
No action labels — Layer infers what each request is doing from which gate/sub-gate was called and from tool_use/tool_result blocks in the message content.
No parent/child wiring — Layer infers request relationships from timestamps, gate identity, and response content. You never manage a request tree.
No session creation call — Sessions are created implicitly on the first request with a new sessionId.
No session ID management — The SDK generates collision-safe UUIDs.
No session cleanup — Idle sessions reactivate automatically. You can call endSession() to explicitly mark a session as done, but it’s optional — sessions idle out naturally.

Dashboard

Session List

The agent gate dashboard shows all sessions with:

Status, duration, request count, total cost
Expandable timeline of requests within each session

Session Detail (Trace View)

Three view modes for inspecting a session:

Timeline — Chronological view of all requests with gate labels
By Action — Grouped by sub-gate/action with aggregate metrics
Trace — Tree-structured view showing orchestrator turns, tool calls, and sub-gate responses

Intelligence Insights

After a session completes, Layer’s intelligence agent automatically analyzes it for:

Behavioral patterns and loop detection
Quality scoring and anomaly detection
Cost efficiency with potential savings estimation
Actionable recommendations (model suggestions, prompt improvements)

Creating an Agent Gate

Go to Dashboard → Agent Gates → Create New
Configure the gate using the dashboard tabs described below.

Dashboard Tabs

When editing an agent gate in the dashboard, configuration is organized into tabs:

Basic Info

Name, description, mode (Observability or Orchestrated), orchestration type (Static or Dynamic for orchestrated mode), and tags.

Primary Model

Primary model selection, fallback chain, routing strategy (single/fallback/round-robin), optimization weights (cost/latency/quality), and smart routing via the Architect. Same configuration surface as standard gates.

Model Pool

Sub-gate management for orchestrated mode. Create new sub-gates inline (name + description + model recommendations) or attach existing gates. Each sub-gate shows its assigned model and a copyable gate ID. Advanced configuration opens the full gate settings for that sub-gate.

Session Settings

Per-session spending limits (soft + hard), session timeout duration, and alert configuration.

Request Settings

System prompt, temperature, max tokens, top P, and override permissions. Same as standard gates.

Sessions

Live session list with status filters, duration, request count, and total cost. Expand a session to see the request timeline. Includes aggregated intelligence insights across sessions (average quality, cost efficiency, loop rate, anomaly rate).

Intelligence

Gate-level intelligence overview with aggregated metrics and recommendations from Layer’s intelligence agent across all analyzed sessions.

Danger Zone

Delete the agent gate. This action is irreversible. Active sessions will no longer accept new requests.

Agent Gate Configuration Reference

Agent gates share all configuration fields from standard gates (model, fallbacks, routing strategy, spending limits, smart routing, etc. — see Gates). The fields below are specific to agent gates.

Agent Gate Fields

Field	Type	Required	Description
`gateType`	`'standard' | 'agent'`	Yes	Must be `'agent'` for agent gates.
`mode`	`'observability' | 'orchestrated'`	Yes	Operating mode. Observability is passthrough; orchestrated enables sub-gate routing.
`orchestrationType`	`'static' | 'dynamic'`	No	Only for orchestrated mode. `static` means the developer calls sub-gates directly. `dynamic` means Layer selects from a pool (coming soon).
`subGates`	`string[]`	No	Array of sub-gate IDs for static orchestrated mode. Managed via the Model Pool tab in the dashboard.
`subGatePool`	`string[]`	No	Array of sub-gate IDs for dynamic orchestrated mode. Layer selects from this pool at runtime.
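
Read together, the rows above imply a few consistency rules between fields. The sketch below writes the fields as a TypeScript interface and checks those rules. Both `AgentGateFields` and `validateAgentGate` are hypothetical names for illustration. The rules are inferred from the descriptions in the table, so Layer's own validation may be stricter or more lenient.

```typescript
// Field names and value sets are taken from the table above.
interface AgentGateFields {
  gateType: 'standard' | 'agent';
  mode: 'observability' | 'orchestrated';
  orchestrationType?: 'static' | 'dynamic';
  subGates?: string[];
  subGatePool?: string[];
}

// Hypothetical consistency check; returns a list of problems (empty when none).
function validateAgentGate(cfg: AgentGateFields): string[] {
  const problems: string[] = [];

  if (cfg.gateType !== 'agent') {
    problems.push("gateType must be 'agent' for agent gates");
  }
  if (cfg.mode !== 'orchestrated' && cfg.orchestrationType !== undefined) {
    problems.push('orchestrationType only applies to orchestrated mode');
  }
  if (cfg.subGates?.length && cfg.orchestrationType !== 'static') {
    problems.push('subGates is used with static orchestration');
  }
  if (cfg.subGatePool?.length && cfg.orchestrationType !== 'dynamic') {
    problems.push('subGatePool is used with dynamic orchestration');
  }
  return problems;
}
```

A static orchestrated gate with one sub-gate ID passes with no problems, while an observability gate that also sets `orchestrationType` produces one.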

Session Settings

Field	Type	Default	Description
`sessionSpendingLimit`	`number | null`	`null`	Soft spending limit per session (USD). When exceeded, requests proceed but Layer adds `X-Layer-Session-Warning: soft_limit_exceeded` to the response.
`sessionHardLimit`	`number | null`	2x soft limit	Hard spending limit per session (USD). When exceeded, requests are rejected with HTTP 402 and the session is marked `budget_exceeded`.
`sessionTimeoutMinutes`	`number`	`30`	Minutes of inactivity before a session transitions from `active` to `idle`.

Session Read-Only Fields

These fields are returned on session objects but cannot be set directly.

Field	Type	Description
`id`	`string`	Session UUID (auto-generated).
`sessionId`	`string`	SDK-generated session identifier (the value from `generateSessionId()`).
`gateId`	`string`	The parent agent gate UUID.
`status`	`SessionStatus`	One of: `active`, `idle`, `completed`, `runaway`, `budget_exceeded`.
`mode`	`AgentGateMode`	Mode inherited from the gate at session creation time.
`totalRequests`	`number`	Running count of requests in this session.
`totalCost`	`number`	Cumulative cost in USD.
`totalTokens`	`number`	Cumulative token count.
`totalLatencyMs`	`number`	Cumulative latency in milliseconds.
`startedAt`	`string`	Timestamp of the first request.
`lastRequestAt`	`string`	Timestamp of the most recent request.
`completedAt`	`string | null`	When the session ended (any terminal state).
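
Because the totals are cumulative, per-request averages have to be derived on the client. A minimal sketch follows, assuming the timestamps are ISO 8601 strings (the table only says "string"). `SessionSnapshot` covers just the fields used here, and `summarizeSession` is a hypothetical helper.

```typescript
// Subset of the read-only session fields listed above.
interface SessionSnapshot {
  totalRequests: number;
  totalCost: number; // USD
  totalLatencyMs: number;
  startedAt: string; // assumed ISO 8601
  lastRequestAt: string; // assumed ISO 8601
}

// Hypothetical helper that derives per-request averages and the request span.
function summarizeSession(s: SessionSnapshot) {
  const n = s.totalRequests;
  return {
    avgCostPerRequest: n > 0 ? s.totalCost / n : 0,
    avgLatencyMs: n > 0 ? s.totalLatencyMs / n : 0,
    // Time between the first and the most recent request.
    requestSpanMs: Date.parse(s.lastRequestAt) - Date.parse(s.startedAt),
  };
}
```

Note that `totalLatencyMs` sums the latency of individual requests, so it can differ substantially from the wall-clock span when the agent spends time in tools between model calls.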

Headers Reference

Header	Required	Description
`X-Layer-Gate-Id`	Yes	The agent gate or sub-gate UUID
`X-Layer-Session-Id`	Yes*	Session identifier from `generateSessionId()`. *Required for agent gates, ignored for standard gates.

Response headers:

Header	Description
`X-Layer-Session-Warning`	Set to `soft_limit_exceeded` when session cost exceeds the soft spending limit
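
On the client side, the two budget signals can be folded into one value your agent loop can branch on. The sketch below is a hypothetical helper, not part of the Layer SDK. It takes an HTTP status and a plain header map, however your HTTP client exposes them, and it assumes that a 402 from the proxy always means the session's hard limit was hit.

```typescript
// Hypothetical client-side helper for interpreting Layer's budget signals.
type LayerBudgetSignal = 'ok' | 'soft_limit_exceeded' | 'budget_exceeded';

function interpretLayerResponse(
  status: number,
  headers: Record<string, string>,
): LayerBudgetSignal {
  // Hard limit: the request was rejected and the session is blocked.
  if (status === 402) return 'budget_exceeded';

  // HTTP header names are case-insensitive, so compare in lowercase.
  for (const name of Object.keys(headers)) {
    if (
      name.toLowerCase() === 'x-layer-session-warning' &&
      headers[name] === 'soft_limit_exceeded'
    ) {
      return 'soft_limit_exceeded';
    }
  }
  return 'ok';
}
```

An agent loop might keep going on `ok`, wrap up its current task on `soft_limit_exceeded`, and stop issuing requests for that session on `budget_exceeded`.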
