Audio - Layer AI

The tts() method converts text into spoken audio using AI voice models from OpenAI, ElevenLabs, and other providers.

Basic Text-to-Speech

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Hello! Welcome to Layer AI text-to-speech.'
  }
});

console.log('Audio URL:', response.audio.url);
console.log('Duration:', response.audio.duration, 'seconds');
console.log('Cost:', response.cost);

Voice Selection

Choose from different voices:

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'The weather today is sunny and warm.',
    voice: 'alloy'  // Options: alloy, echo, fable, onyx, nova, shimmer
  }
});

Available Voices

Voice	Characteristics
`alloy`	Neutral, balanced
`echo`	Clear, professional
`fable`	Warm, storytelling
`onyx`	Deep, authoritative
`nova`	Energetic, friendly
`shimmer`	Soft, calm

Try different voices to find the best match for your content. Use alloy or echo for professional content, fable for narratives, and nova for upbeat messages.

Audio Format

Control the output audio format:

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    responseFormat: 'mp3'  // mp3, opus, aac, flac, wav, pcm
  }
});

Format Options

Format	Quality	Size	Use Case
`mp3`	Good	Small	General use, streaming
`opus`	Great	Very small	Web, low bandwidth
`aac`	Good	Small	Mobile apps
`flac`	Lossless (compressed)	Large	High quality, archival
`wav`	Lossless (uncompressed)	Very large	Professional audio editing
`pcm`	Lossless (raw samples, no header)	Very large	Raw audio processing
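
The use-case column can be encoded as a small lookup, so calling code picks a format by intent rather than by name. This is only a sketch: the `UseCase` labels and the `pickFormat` helper are hypothetical names chosen for illustration and are not part of the Layer SDK.

```typescript
// Hypothetical helper: maps an intent to one of the documented responseFormat values.
type AudioFormat = 'mp3' | 'opus' | 'aac' | 'flac' | 'wav' | 'pcm';
type UseCase =
  | 'general'
  | 'low-bandwidth'
  | 'mobile'
  | 'archival'
  | 'editing'
  | 'raw-processing';

// Mirrors the "Use Case" column of the format table.
const FORMAT_BY_USE_CASE: Record<UseCase, AudioFormat> = {
  'general': 'mp3',
  'low-bandwidth': 'opus',
  'mobile': 'aac',
  'archival': 'flac',
  'editing': 'wav',
  'raw-processing': 'pcm',
};

function pickFormat(useCase: UseCase): AudioFormat {
  return FORMAT_BY_USE_CASE[useCase];
}
```

The result can be passed straight through as `responseFormat: pickFormat('low-bandwidth')`.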

Speech Speed

Adjust playback speed:

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'This will be spoken at different speeds.',
    speed: 1.25  // 0.25 to 4.0
  }
});

Speed Guide

Speed	Use Case
`0.5`	Very slow, educational
`0.75`	Slow, clear pronunciation
`1.0`	Normal speed (default)
`1.25`	Slightly faster, energetic
`1.5`	Fast, time-saving
`2.0+`	Very fast, previews

Speeds below 0.5 or above 2.0 may sound unnatural. Stick to 0.75-1.5 for most use cases.
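
When the speed comes from user input, it may help to clamp it to the documented 0.25 to 4.0 range and flag values outside the recommended 0.75 to 1.5 band before sending the request. The sketch below assumes nothing about how the API reacts to out-of-range values; `checkSpeed` and `SpeedCheck` are hypothetical names.

```typescript
// Documented limits for the speed parameter.
const MIN_SPEED = 0.25;
const MAX_SPEED = 4.0;

interface SpeedCheck {
  speed: number;    // value clamped into the documented range
  natural: boolean; // true when inside the recommended 0.75-1.5 band
}

// Hypothetical helper: clamps a requested speed and reports whether it is
// likely to sound natural according to the guidance above.
function checkSpeed(requested: number): SpeedCheck {
  if (!Number.isFinite(requested)) {
    throw new RangeError(`speed must be a finite number, got ${requested}`);
  }
  const speed = Math.min(MAX_SPEED, Math.max(MIN_SPEED, requested));
  return { speed, natural: speed >= 0.75 && speed <= 1.5 };
}
```

A caller could log a warning when `natural` is false and still send the clamped `speed`.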

Parameters

Request Parameters

Parameter	Type	Default	Description
`input`	`string`	Required	Text to convert to speech
`voice`	`string`	Model default	Voice to use
`speed`	`number`	1.0	Playback speed (0.25-4.0)
`responseFormat`	`string`	`mp3`	Audio format

Response

interface AudioResponse {
  id: string;                    // Request ID
  model: string;                 // Model used
  audio: AudioOutput;            // Audio data
  cost: number;                  // Request cost in USD
  latency: number;               // Response time in ms
}

interface AudioOutput {
  url?: string;                  // Hosted audio URL
  base64?: string;               // Base64-encoded audio
  format?: string;               // Audio format
  duration?: number;             // Length in seconds
}
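
Because every field of `AudioOutput` is optional, code that plays the audio should handle a missing `url`. The sketch below prefers the hosted URL and falls back to a data URI built from `base64`. It assumes `base64` holds raw base64 without a `data:` prefix, which this page does not confirm; `playableSrc` and the MIME table are illustrative, and formats not listed fall back to a generic type.

```typescript
interface AudioOutput {
  url?: string;
  base64?: string;
  format?: string;
  duration?: number;
}

// MIME types for a few of the documented formats (not exhaustive).
const MIME_BY_FORMAT: Record<string, string> = {
  mp3: 'audio/mpeg',
  wav: 'audio/wav',
  flac: 'audio/flac',
  aac: 'audio/aac',
};

// Hypothetical helper: returns something usable as an <audio> src, or
// undefined when the response carries neither a URL nor base64 data.
function playableSrc(audio: AudioOutput): string | undefined {
  if (audio.url) return audio.url;
  if (!audio.base64) return undefined;
  // Assumption: if the payload is already a data URI, pass it through untouched.
  if (audio.base64.startsWith('data:')) return audio.base64;
  const mime = MIME_BY_FORMAT[audio.format ?? 'mp3'] ?? 'application/octet-stream';
  return `data:${mime};base64,${audio.base64}`;
}
```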

Best Practices

1. Format Text for Speech

// Good ✅ - Natural speech patterns
{
  input: 'Hello, and welcome to our podcast. Today we\'re discussing AI.'
}

// Less natural - Missing punctuation
{
  input: 'hello and welcome to our podcast today were discussing AI'
}
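
Some of this cleanup can be automated before the text is sent. The sketch below collapses whitespace, capitalizes the first letter, and adds a closing period when the text has no terminal punctuation. It cannot restore missing commas or apostrophes, so it complements careful writing rather than replacing it. `prepareForSpeech` is a hypothetical helper.

```typescript
// Hypothetical helper: light normalization of text before passing it as `input`.
function prepareForSpeech(raw: string): string {
  // Collapse runs of whitespace (including newlines) into single spaces.
  const collapsed = raw.replace(/\s+/g, ' ').trim();
  if (collapsed === '') return collapsed;

  // Capitalize the first character.
  const capitalized = collapsed.charAt(0).toUpperCase() + collapsed.slice(1);

  // Ensure the text ends with sentence punctuation.
  return /[.!?]$/.test(capitalized) ? capitalized : `${capitalized}.`;
}
```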

2. Use SSML for Control

Some models support SSML (Speech Synthesis Markup Language):

{
  input: '<speak>Hello! <break time="500ms"/> This is a pause.</speak>'
}
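
When SSML is assembled from dynamic text, characters such as `&` and `<` need escaping, otherwise the markup becomes invalid. The sketch below joins text segments with a fixed pause between them. Whether a given model honors `<break>` is not confirmed here, so check the model's documentation first; `escapeXml` and `toSsml` are hypothetical helpers.

```typescript
// Escape the five XML special characters so user text cannot break the markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: wraps segments in <speak> and inserts a pause between them.
function toSsml(segments: string[], pauseMs: number): string {
  const pause = `<break time="${Math.round(pauseMs)}ms"/>`;
  const body = segments.map(escapeXml).join(` ${pause} `);
  return `<speak>${body}</speak>`;
}
```

For example, `toSsml(['Hello!', 'This is a pause.'], 500)` produces the same string as the hand-written `input` shown earlier in this section.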

3. Choose Appropriate Format

// Good ✅ - MP3 for web playback
{
  input: 'Website announcement',
  responseFormat: 'mp3'
}

// Good ✅ - FLAC for podcast production
{
  input: 'Podcast episode narration',
  responseFormat: 'flac'
}

4. Always Handle Errors

try {
  const response = await layer.tts({
    gateId: 'your-gate-id',
    data: {
      input: 'Sample text'
    }
  });

  return response.audio.url;
} catch (error) {
  console.error('TTS generation failed:', error);
  return null;
}
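
For transient failures, a retry with exponential backoff is a common extension of the try/catch pattern. The sketch below retries on every error because this page does not say which errors are retryable; real code would likely filter by error type. `withRetry` and `backoffDelays` are hypothetical helpers, not SDK functions.

```typescript
// Delays double on each retry: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelays(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Runs `task`, retrying up to `retries` times with exponential backoff.
// Rethrows the last error once all attempts have failed.
async function withRetry<T>(
  task: () => Promise<T>,
  retries = 3,
  baseMs = 500
): Promise<T> {
  const delays = backoffDelays(retries, baseMs);
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait before the next attempt.
        await new Promise<void>(resolve => setTimeout(resolve, delays[attempt]));
      }
    }
  }
  throw lastError;
}
```

The TTS call is then wrapped as `withRetry(() => layer.tts({ ... }))`, with the surrounding try/catch handling the final failure.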

5. Monitor Costs

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: longText
  }
});

console.log(`Duration: ${response.audio.duration}s`);
console.log(`Cost: $${response.cost.toFixed(6)}`);
console.log(`Characters: ${longText.length}`);
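
Raw cost figures are easier to compare across requests once normalized. The sketch below derives cost per 1,000 characters and cost per minute of audio from the values logged above. `summarizeCost` is a hypothetical helper, and it makes no assumption about how providers actually bill.

```typescript
interface CostSummary {
  costPerThousandChars: number;      // USD per 1,000 input characters
  costPerMinute: number | undefined; // USD per minute of audio, if duration is known
}

// Hypothetical helper: normalizes a request's cost by input size and audio length.
function summarizeCost(
  cost: number,
  characters: number,
  durationSeconds?: number
): CostSummary {
  return {
    costPerThousandChars: characters > 0 ? (cost / characters) * 1000 : 0,
    costPerMinute:
      durationSeconds !== undefined && durationSeconds > 0
        ? (cost / durationSeconds) * 60
        : undefined,
  };
}
```

A call such as `summarizeCost(response.cost, longText.length, response.audio.duration)` fits the example above, since `duration` is optional in `AudioOutput`.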

Examples

Podcast Intro

const response = await layer.tts({
  gateId: 'your-gate-id',
  metadata: { type: 'podcast-intro' },
  data: {
    input: 'Welcome to Tech Talks, the podcast where we explore the latest in technology and innovation.',
    voice: 'fable',
    responseFormat: 'mp3',
    speed: 1.0
  }
});

Product Announcement

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Introducing our new flagship product, designed to revolutionize your workflow!',
    voice: 'nova',
    speed: 1.1
  }
});

Educational Content

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'In this lesson, we will learn about photosynthesis, the process by which plants convert sunlight into energy.',
    voice: 'echo',
    speed: 0.9,  // Slower for learning
    responseFormat: 'mp3'
  }
});

Audiobook Narration

const chapter = `
  Chapter One: The Beginning.

  It was a dark and stormy night. The wind howled through the trees,
  and rain pounded against the windows of the old mansion.
`;

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: chapter,
    voice: 'fable',
    responseFormat: 'flac',  // High quality for production
    speed: 1.0
  }
});
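
Full chapters can be much longer than this excerpt. This page does not state an input length limit, but if the chosen model has one, the text can be split at sentence boundaries and sent as several requests. The sketch below is one way to do that; `splitForTts` is a hypothetical helper and `maxChars` is a value you would take from the provider's documentation.

```typescript
// Hypothetical helper: splits text into chunks of roughly maxChars or fewer,
// breaking only between sentences. A single sentence longer than maxChars
// is kept whole rather than cut mid-sentence.
function splitForTts(text: string, maxChars: number): string[] {
  const normalized = text.replace(/\s+/g, ' ').trim();
  // Each match is one sentence plus its trailing punctuation and whitespace.
  const sentences = normalized.match(/[^.!?]+[.!?]*\s*/g) ?? [];

  const chunks: string[] = [];
  let current = '';
  for (const sentence of sentences) {
    if (current !== '' && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }
  if (current.trim() !== '') chunks.push(current.trim());
  return chunks;
}
```

Each chunk can then be passed as `input` in its own request, keeping `voice`, `speed`, and `responseFormat` identical so the pieces sound consistent.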

Voice Assistant Response

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Your package will arrive tomorrow between 2 and 4 PM.',
    voice: 'alloy',
    responseFormat: 'opus',  // Small file for quick delivery
    speed: 1.0
  }
});

Multilingual Content

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Bonjour! Comment allez-vous?',
    voice: 'shimmer'
  }
});

Advanced Usage

Batch Generation

Generate multiple audio files:

const scripts = [
  'Welcome to the tour.',
  'This is the main lobby.',
  'Here is the conference room.',
  'Thank you for visiting!'
];

const audioFiles = await Promise.all(
  scripts.map(text =>
    layer.tts({
      gateId: 'your-gate-id',
      data: {
        input: text,
        voice: 'echo'
      }
    })
  )
);

audioFiles.forEach((response, i) => {
  console.log(`Audio ${i + 1}:`, response.audio.url);
});
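
`Promise.all` rejects as soon as any single request fails, which discards the results that did succeed. Where partial results are acceptable, `Promise.allSettled` is a standard alternative. The sketch below separates successes from failures after the batch completes; `partitionSettled` is a hypothetical helper.

```typescript
interface Partitioned<T> {
  ok: T[];                                      // values from fulfilled promises
  failed: { index: number; reason: unknown }[]; // position and error of each rejection
}

// Hypothetical helper: splits Promise.allSettled results into successes and failures.
function partitionSettled<T>(results: PromiseSettledResult<T>[]): Partitioned<T> {
  const ok: T[] = [];
  const failed: { index: number; reason: unknown }[] = [];
  results.forEach((result, index) => {
    if (result.status === 'fulfilled') {
      ok.push(result.value);
    } else {
      failed.push({ index, reason: result.reason });
    }
  });
  return { ok, failed };
}
```

To use it, replace `Promise.all(...)` in the batch example with `Promise.allSettled(...)` and pass the result to `partitionSettled`. The `index` of each failure identifies which script to retry.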

Download Audio File

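import { writeFile } from 'node:fs/promises';

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Hello world',
    responseFormat: 'mp3'
  }
});

// audio.url is optional in AudioOutput, so check it before downloading
if (!response.audio.url) {
  throw new Error('Response did not include an audio URL');
}

// Download to a local file (fetch is global in Node.js 18+)
const audioData = await fetch(response.audio.url);
const buffer = await audioData.arrayBuffer();
await writeFile('output.mp3', Buffer.from(buffer));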

Base64 Audio Embedding

const response = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    responseFormat: 'mp3'
  }
});

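// Use base64 in HTML via a data URI
// (assumes audio.base64 is raw base64 without a "data:" prefix)
const audioHTML = `
  <audio controls>
    <source src="data:audio/mpeg;base64,${response.audio.base64}" type="audio/mpeg">
  </audio>
`;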

Override Model

Use a specific TTS model:

const response = await layer.tts({
  gateId: 'your-gate-id',
  model: 'tts-1-hd',  // High-quality model
  data: {
    input: 'Premium audio content'
  }
});

Custom Metadata

Track TTS generation:

const response = await layer.tts({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    feature: 'voiceover',
    language: 'en-US'
  },
  data: {
    input: 'Sample narration'
  }
});

Use Cases

E-Learning Platforms

Convert course content to audio for accessibility:

const lesson = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: courseText,
    voice: 'echo',
    speed: 0.9,
    responseFormat: 'mp3'
  }
});

IVR Systems

Generate prompts for phone systems:

const greeting = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: 'Thank you for calling. Press 1 for sales, press 2 for support.',
    voice: 'alloy',
    responseFormat: 'wav'
  }
});

Content Creation

Add voiceovers to videos:

const narration = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: videoScript,
    voice: 'fable',
    responseFormat: 'flac'
  }
});

Accessibility

Make written content accessible:

const audioVersion = await layer.tts({
  gateId: 'your-gate-id',
  data: {
    input: articleText,
    voice: 'echo',
    speed: 1.0,
    responseFormat: 'mp3'
  }
});

Next Steps

Video

Add audio to video content

Chat

Generate text for TTS

Gates & Routing

How Layer routes TTS requests

Cost Tracking

Monitor TTS costs
