The chat() and chatStream() methods provide text generation with support for multimodal inputs, function calling, and streaming.
## Basic Chat

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in simple terms' }
    ]
  }
});

console.log(response.content);
console.log('Cost:', response.cost);
console.log('Model:', response.model);
```
## Streaming

Stream responses token by token for better UX:

```typescript
const stream = layer.chatStream({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: 'Write a poem about the ocean' }
    ]
  }
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content || '');
}
console.log('\n');
```
Use streaming for long responses to show progress to users instead of making them wait.
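If you also need the complete text (for logging or caching) while streaming, you can accumulate chunks as they arrive. A minimal sketch, assuming only that each chunk has an optional `content` string as in the example above; `collectStream` is a hypothetical helper, not part of the SDK:

```typescript
// Accumulate streamed chunks into the complete response text while
// still writing each piece incrementally. Works with any AsyncIterable
// of { content?: string } chunks, such as the one layer.chatStream() returns.
async function collectStream(
  stream: AsyncIterable<{ content?: string }>
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    const piece = chunk.content ?? '';
    process.stdout.write(piece); // incremental display for the user
    full += piece;               // keep the complete text for later
  }
  return full;
}
```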
## Message Roles

### System Messages

Set behavior and context:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are a senior software engineer who explains concepts clearly and concisely.'
      },
      {
        role: 'user',
        content: 'What is dependency injection?'
      }
    ]
  }
});
```
### Conversation History

Include previous messages for context:

```typescript
const conversation = [
  { role: 'user', content: 'What is 2+2?' },
  { role: 'assistant', content: '2+2 equals 4.' },
  { role: 'user', content: 'What about 3+3?' }
];

const response = await layer.chat({
  gateId: 'your-gate-id',
  data: { messages: conversation }
});
```
## Vision (Multimodal)

Send images in messages:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: "What's in this image?" },
          {
            type: 'image_url',
            image_url: {
              url: 'https://example.com/image.jpg',
              detail: 'high' // 'auto', 'low', or 'high'
            }
          }
        ]
      }
    ]
  }
});
```
### Image Detail Levels

| Level | Use Case | Tokens |
|---|---|---|
| `low` | Simple identification | ~85 tokens |
| `high` | Detailed analysis | ~765-2000 tokens |
| `auto` | Model decides | Varies |
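The figures above can be used for rough budgeting before a request is sent. A sketch using the approximate values from the table (actual usage varies by provider and image size, and `auto` cannot be known ahead of time, so it is budgeted at the worst case):

```typescript
type Detail = 'low' | 'high' | 'auto';

// Approximate prompt tokens consumed per image, per the table above.
function estimateImageTokens(detail: Detail): number {
  switch (detail) {
    case 'low':
      return 85;    // fixed low-resolution pass
    case 'high':
      return 2000;  // upper end of ~765-2000
    case 'auto':
      return 2000;  // model decides; budget conservatively
  }
}
```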
### Multiple Images

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Compare these two images' },
          {
            type: 'image_url',
            image_url: { url: 'https://example.com/image1.jpg' }
          },
          {
            type: 'image_url',
            image_url: { url: 'https://example.com/image2.jpg' }
          }
        ]
      }
    ]
  }
});
```
## Function Calling

Define tools the model can use:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in San Francisco?" }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get the current weather for a location',
          parameters: {
            type: 'object',
            properties: {
              location: {
                type: 'string',
                description: 'City and state, e.g. San Francisco, CA'
              },
              unit: {
                type: 'string',
                enum: ['celsius', 'fahrenheit'],
                description: 'Temperature unit'
              }
            },
            required: ['location']
          }
        }
      }
    ],
    toolChoice: 'auto'
  }
});

// Check if the model wants to call a function
if (response.toolCalls) {
  for (const toolCall of response.toolCalls) {
    console.log('Function:', toolCall.function.name);
    console.log('Arguments:', toolCall.function.arguments);

    // Execute the function
    const args = JSON.parse(toolCall.function.arguments);
    const result = await getWeather(args.location, args.unit);

    // Send the result back to the model
    // (add it to the conversation and make another request)
  }
}
```
### Tool Choice Options

| Value | Behavior |
|---|---|
| `auto` | Model decides whether to call functions |
| `required` | Model must call at least one function |
| `none` | Model cannot call functions |
| `{type: 'function', function: {name: 'func'}}` | Force a specific function |
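The object form that forces a specific function is awkward to write inline. A small helper makes the intent explicit (`forceTool` is a hypothetical convenience, not part of the SDK; the `ToolChoice` shape follows the table above):

```typescript
type ToolChoice =
  | 'auto'
  | 'required'
  | 'none'
  | { type: 'function'; function: { name: string } };

// Build a toolChoice value that forces the model to call one named function.
function forceTool(name: string): ToolChoice {
  return { type: 'function', function: { name } };
}
```

Usage: `toolChoice: forceTool('get_weather')` in the `data` object.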
### Complete Function Calling Example

```typescript
// Initial request
const response1 = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in Tokyo?" }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get weather for a location',
          parameters: {
            type: 'object',
            properties: {
              location: { type: 'string' },
              unit: { type: 'string', enum: ['C', 'F'] }
            },
            required: ['location']
          }
        }
      }
    ]
  }
});

// Execute the function
const toolCall = response1.toolCalls[0];
const args = JSON.parse(toolCall.function.arguments);
const weatherData = { temp: 22, condition: 'sunny' };

// Send the result back
const response2 = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in Tokyo?" },
      {
        role: 'assistant',
        content: null,
        toolCalls: response1.toolCalls
      },
      {
        role: 'tool',
        toolCallId: toolCall.id,
        content: JSON.stringify(weatherData)
      }
    ]
  }
});

console.log(response2.content);
// "It's currently 22°C and sunny in Tokyo."
```
## JSON Mode

Force structured JSON output:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: 'Extract name, email, and age from: John Doe, john@example.com, 30 years old'
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

const data = JSON.parse(response.content);
console.log(data);
// { name: "John Doe", email: "john@example.com", age: 30 }
```
When using JSON mode, include “JSON” in your prompt to improve reliability.
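One way to satisfy that requirement consistently is to append the instruction in a single place rather than remembering it per prompt. A sketch (the helper and its wording are assumptions, adjust to taste):

```typescript
// Ensure a prompt explicitly mentions JSON, which providers typically
// require (or strongly recommend) when responseFormat is json_object.
function withJsonInstruction(prompt: string): string {
  return /\bJSON\b/.test(prompt)
    ? prompt // already mentions JSON; leave it alone
    : `${prompt}\n\nRespond with a JSON object.`;
}
```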
## Parameters

### Request Parameters

| Parameter | Type | Description |
|---|---|---|
| `gateId` | string | Gate UUID (required) |
| `gateName` | string | Display name (optional) |
| `model` | string | Override gate's model (optional) |
| `metadata` | object | Custom metadata (optional) |

### Data Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `messages` | Message[] | Required | Conversation history |
| `temperature` | number | 1.0 | Randomness (0-2) |
| `maxTokens` | number | Model max | Maximum response length |
| `topP` | number | 1.0 | Nucleus sampling |
| `stop` | string[] | None | Stop sequences |
| `tools` | Tool[] | None | Available functions |
| `toolChoice` | string \| object | None | Tool selection strategy |
| `responseFormat` | object | None | Force JSON output |
### Temperature Guide

| Value | Use Case | Behavior |
|---|---|---|
| 0 | Code, math, facts | Deterministic, focused |
| 0.7 | General conversation | Balanced |
| 1.0 | Creative writing | More random |
| 1.5+ | Poetry, brainstorming | Very creative |
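If different features route through the same chat helper, the guide above can be encoded directly so each call site just names its task. A sketch (the task names and values mirror the table; they are suggestions, not SDK defaults):

```typescript
type Task = 'code' | 'chat' | 'creative' | 'brainstorm';

// Suggested temperature per task type, following the guide above.
function temperatureFor(task: Task): number {
  const table: Record<Task, number> = {
    code: 0,          // deterministic, focused
    chat: 0.7,        // balanced
    creative: 1.0,    // more random
    brainstorm: 1.5,  // very creative
  };
  return table[task];
}
```

Usage: `data: { messages, temperature: temperatureFor('code') }`.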
## Response

```typescript
interface ChatResponse {
  id: string;               // Request ID
  model: string;            // Model that generated the response
  content: string;          // Response text
  finishReason: string;     // Why generation stopped
  cost: number;             // Request cost in USD
  latency: number;          // Response time in ms
  toolCalls?: ToolCall[];   // Function calls (if any)
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}
```
### Finish Reasons

| Reason | Meaning |
|---|---|
| `completed` | Natural completion |
| `length_limit` | Hit max_tokens |
| `tool_call` | Model called a function |
| `filtered` | Content filtered |
| `error` | Request failed |
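Checking `finishReason` after each request catches truncated or filtered responses before they reach users. A sketch of one way to branch on it (the action tags and the helper itself are illustrative, not part of the SDK):

```typescript
// Decide what to do with a response based on why generation stopped.
// Returns a short action tag; real code would retry, raise maxTokens, etc.
function handleFinishReason(
  reason: string
): 'ok' | 'truncated' | 'tool' | 'rejected' {
  switch (reason) {
    case 'completed':
      return 'ok';
    case 'length_limit':
      return 'truncated'; // consider raising maxTokens and retrying
    case 'tool_call':
      return 'tool';      // execute the requested function
    default:
      return 'rejected';  // 'filtered' or 'error'
  }
}
```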
## Advanced

### Override Model

Override the gate's configured model:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  model: 'gpt-4o', // Use a specific model
  data: {
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});
```
### Metadata

Track requests with metadata:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    sessionId: 'session-456',
    feature: 'chat-support'
  },
  data: {
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});
```
### Stop Sequences

Stop generation at specific strings:

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: 'List 3 fruits' }
    ],
    stop: ['\n4.', "That's all"]
  }
});
```
## Best Practices

### 1. Use Streaming for Long Responses

```typescript
// Good ✅ - Better UX
for await (const chunk of layer.chatStream({...})) {
  process.stdout.write(chunk.content || '');
}

// Less ideal - User waits
const response = await layer.chat({...});
console.log(response.content);
```
### 2. Always Handle Errors

```typescript
// Good ✅
try {
  const response = await layer.chat({...});
  return response.content;
} catch (error) {
  logger.error('Chat failed:', error);
  return 'Sorry, I encountered an error.';
}

// Bad ❌
const response = await layer.chat({...}); // Unhandled errors
```
### 3. Provide System Context

```typescript
// Good ✅ - Clear behavior
{
  messages: [
    { role: 'system', content: 'You are a helpful math tutor for high school students.' },
    { role: 'user', content: 'Explain calculus' }
  ]
}

// Less effective
{
  messages: [
    { role: 'user', content: 'Explain calculus' }
  ]
}
```
### 4. Monitor Costs

```typescript
const response = await layer.chat({...});
console.log(`Cost: $${response.cost.toFixed(4)}`);
console.log(`Tokens: ${response.usage?.totalTokens}`);
```
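Per-request logging is a start; for budgets you usually want a running total across requests. A minimal sketch (the field names follow the `ChatResponse` interface above; the `CostTracker` class is hypothetical, not part of the SDK):

```typescript
// Track cumulative spend and token usage across requests.
class CostTracker {
  totalCost = 0;
  totalTokens = 0;

  record(response: { cost: number; usage?: { totalTokens: number } }): void {
    this.totalCost += response.cost;
    this.totalTokens += response.usage?.totalTokens ?? 0;
  }
}
```

Call `tracker.record(response)` after each `chat()` call and alert when `totalCost` crosses a threshold.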
## Examples

### Customer Support Bot

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  metadata: { userId: 'user-123', type: 'support' },
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are a friendly customer support agent. Be helpful and concise.'
      },
      {
        role: 'user',
        content: 'How do I reset my password?'
      }
    ],
    temperature: 0.7,
    maxTokens: 300
  }
});
```
### Code Explanation

```typescript
const response = await layer.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are an expert programmer who explains code clearly.'
      },
      {
        role: 'user',
        content: 'Explain this code:\n```python\n[x**2 for x in range(10)]\n```'
      }
    ],
    temperature: 0.3 // Lower for more focused technical responses
  }
});
```
### Image Analysis

```typescript
const response = await layer.chat({
  gateId: 'your-vision-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Describe this image in detail' },
          {
            type: 'image_url',
            image_url: {
              url: 'https://example.com/photo.jpg',
              detail: 'high'
            }
          }
        ]
      }
    ]
  }
});
```
## Next Steps