The embeddings() method converts text into high-dimensional vectors that capture semantic meaning, enabling similarity search, clustering, and RAG (Retrieval-Augmented Generation).

Basic Embeddings

const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'The quick brown fox jumps over the lazy dog'
  }
});

console.log('Embedding vector:', response.embeddings[0]);
console.log('Dimensions:', response.embeddings[0].length);
console.log('Cost:', response.cost);

Multiple Inputs

Generate embeddings for multiple texts at once:
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: [
      'What is machine learning?',
      'How does AI work?',
      'Explain neural networks'
    ]
  }
});

// Each input gets an embedding
response.embeddings.forEach((embedding, i) => {
  console.log(`Text ${i + 1} embedding:`, embedding.slice(0, 5), '...');
});
Batch multiple texts together for better performance and lower costs.
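
Most providers cap how many inputs a single request accepts, so large collections still need to be split into chunks. A minimal sketch of batched embedding using the method above (the batch size of 100 is an assumption; check your provider's limit):

async function embedAll(texts: string[], batchSize = 100): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const response = await layer.embeddings({
      gateId: 'your-gate-id',
      data: { input: texts.slice(i, i + batchSize) }
    });
    vectors.push(...response.embeddings);
  }
  return vectors;
}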

Custom Dimensions

Control the embedding vector size:
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    dimensions: 512  // Smaller vectors = faster search
  }
});

console.log('Vector size:', response.embeddings[0].length);
// Output: Vector size: 512

Dimension Trade-offs

Dimensions   Storage     Speed     Accuracy
256          Low         Fast      Good
512          Medium      Medium    Better
1024         Medium      Medium    Great
1536         High        Slower    Best
3072         Very High   Slowest   Excellent
Higher dimensions capture more nuance but cost more storage and make search slower. Start with 1024 for most use cases.
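
To make the storage column concrete: stored as 32-bit floats, a vector takes dimensions × 4 bytes, so the trade-off is easy to estimate up front:

// Approximate raw storage for float32 vectors (excludes index overhead)
function storageBytes(count: number, dimensions: number): number {
  return count * dimensions * 4; // 4 bytes per float32 value
}

console.log(storageBytes(1_000_000, 1536) / 1e9); // ≈ 6.1 GB
console.log(storageBytes(1_000_000, 512) / 1e9);  // ≈ 2.0 GB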

Encoding Format

Choose output format:
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    encodingFormat: 'float'  // 'float' or 'base64'
  }
});

Format Options

Format   Use Case                             Storage
float    Direct computation, most libraries   Larger
base64   Network transfer, compression        Smaller
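
With base64, each vector arrives as an encoded byte string instead of a number[]. Assuming the provider packs little-endian 32-bit floats (the common convention; confirm for your model), a sketch for decoding it back in Node.js:

// Decode a base64 embedding into floats. The Uint8Array copy
// guarantees the 4-byte alignment Float32Array requires.
function decodeEmbedding(b64: string): number[] {
  const bytes = new Uint8Array(Buffer.from(b64, 'base64'));
  return Array.from(new Float32Array(bytes.buffer));
}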

Parameters

Request Parameters

Parameter        Type                  Default         Description
input            string | string[]     Required        Text(s) to embed
dimensions       number                Model default   Output vector size
encodingFormat   'float' | 'base64'    float           Encoding format

Response

interface EmbeddingsResponse {
  id: string;                    // Request ID
  model: string;                 // Model used
  embeddings: number[][];        // Vector embeddings
  cost: number;                  // Request cost in USD
  latency: number;               // Response time in ms
  usage?: {
    promptTokens: number;
    totalTokens: number;
  };
}
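
The cost, latency, and usage fields make per-request accounting straightforward:

const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: ['first text', 'second text'] }
});

console.log(`Embedded ${response.embeddings.length} texts with ${response.model}`);
console.log(`Cost: $${response.cost}, latency: ${response.latency}ms`);
console.log(`Tokens used: ${response.usage?.totalTokens ?? 'n/a'}`);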

Use Cases

Semantic Search

Find similar documents:
// 1. Embed all documents
const documents = [
  'Machine learning is a subset of AI',
  'Neural networks are inspired by the brain',
  'Python is a programming language',
  'Deep learning uses multiple layers'
];

const docEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: documents }
});

// 2. Embed user query
const queryResponse = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'What is AI?' }
});

const queryEmbedding = queryResponse.embeddings[0];

// 3. Calculate cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magA * magB);
}

// 4. Find most similar documents
const similarities = docEmbeddings.embeddings.map((docEmb, i) => ({
  document: documents[i],
  similarity: cosineSimilarity(queryEmbedding, docEmb)
}));

similarities.sort((a, b) => b.similarity - a.similarity);

console.log('Most relevant:', similarities[0].document);
// Output: "Machine learning is a subset of AI"
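
cosineSimilarity recomputes both magnitudes on every call. If you normalize vectors to unit length once, similarity reduces to a plain dot product, which pays off when one query is compared against many documents. A sketch reusing the variables above:

function normalize(v: number[]): number[] {
  const mag = Math.sqrt(v.reduce((sum, val) => sum + val * val, 0));
  return v.map(val => val / mag);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, val, i) => sum + val * b[i], 0);
}

// Normalize documents once; each query then costs a single dot product
const normalizedDocs = docEmbeddings.embeddings.map(normalize);
const normalizedQuery = normalize(queryEmbedding);
const scores = normalizedDocs.map(doc => dot(normalizedQuery, doc));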

RAG (Retrieval-Augmented Generation)

Combine embeddings with chat for context-aware responses:
// 1. Embed knowledge base
const knowledgeBase = [
  'Our store hours are 9am-5pm Monday-Friday',
  'We offer free shipping on orders over $50',
  'Returns are accepted within 30 days'
];

const kbEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: knowledgeBase }
});

// 2. User asks a question
const userQuestion = 'What are your shipping policies?';

const questionEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: userQuestion }
});

// 3. Find relevant context
const relevantDocs = kbEmbeddings.embeddings
  .map((emb, i) => ({
    text: knowledgeBase[i],
    similarity: cosineSimilarity(questionEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 2)
  .map(d => d.text);

// 4. Answer with context
const answer = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: `Answer using this context: ${relevantDocs.join('. ')}`
      },
      {
        role: 'user',
        content: userQuestion
      }
    ]
  }
});

console.log(answer.content);
// "We offer free shipping on orders over $50."

Clustering

Group similar texts:
const texts = [
  'I love this product!',
  'Great quality, highly recommend',
  'Terrible experience, very disappointed',
  'Amazing service, will buy again',
  'Worst purchase ever'
];

const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: texts }
});

// Use embeddings with clustering algorithm (k-means, etc.)
// Positive reviews will cluster together, negative reviews together
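
The clustering step itself is left open above; here is a minimal k-means sketch over the returned vectors (plain TypeScript, naive seeding, k = 2 for positive vs. negative reviews; a real workload would use a library or better initialization):

function kMeans(vectors: number[][], k: number, iterations = 20): number[] {
  // Naive seeding: take the first k vectors as initial centroids
  let centroids = vectors.slice(0, k).map(v => [...v]);
  let assignments = new Array(vectors.length).fill(0);

  const dist = (a: number[], b: number[]) =>
    a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0);

  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: nearest centroid for each vector
    assignments = vectors.map(v => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist(v, centroids[c]) < dist(v, centroids[best])) best = c;
      }
      return best;
    });

    // Update step: move each centroid to the mean of its members
    centroids = centroids.map((centroid, c) => {
      const members = vectors.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return centroid;
      return centroid.map((_, d) =>
        members.reduce((sum, m) => sum + m[d], 0) / members.length
      );
    });
  }
  return assignments;
}

const clusters = kMeans(response.embeddings, 2);
texts.forEach((text, i) => console.log(`Cluster ${clusters[i]}: ${text}`));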

Duplicate Detection

Find duplicate or near-duplicate content:
const newArticle = 'How to train a machine learning model';

const newEmbedding = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: newArticle }
});

// Compare against existing articles
const existingEmbeddings: { embedding: number[] }[] = [...]; // Previously computed

const duplicates = existingEmbeddings.filter(existing => {
  const similarity = cosineSimilarity(
    newEmbedding.embeddings[0],
    existing.embedding
  );
  return similarity > 0.95; // Very similar
});

if (duplicates.length > 0) {
  console.log('Potential duplicate found');
}

Recommendation System

Recommend similar items:
// User liked this movie
const likedMovie = 'A sci-fi thriller about time travel';

const likedEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: likedMovie }
});

// Find similar movies
const movieDatabase = [
  'A space adventure with aliens',
  'A romantic comedy about weddings',
  'A time-bending mystery thriller',
  'A cooking competition show'
];

const movieEmbs = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: movieDatabase }
});

const recommendations = movieEmbs.embeddings
  .map((emb, i) => ({
    movie: movieDatabase[i],
    similarity: cosineSimilarity(likedEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 3);

console.log('You might like:', recommendations[0].movie);
// "A time-bending mystery thriller"

Best Practices

1. Batch Inputs for Efficiency

// Good ✅ - Batch multiple texts
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['text1', 'text2', 'text3', 'text4', 'text5']
  }
});

// Less efficient ❌ - One request per text
for (const text of texts) {
  await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });
}

2. Normalize Text First

function normalizeText(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/\s+/g, ' ');
}

// Good ✅ - Consistent embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: normalizeText('  Sample  Text  ')
  }
});

3. Cache Embeddings

const embeddingCache = new Map<string, number[]>();

async function getEmbedding(text: string): Promise<number[]> {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text)!;
  }

  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });

  const embedding = response.embeddings[0];
  embeddingCache.set(text, embedding);
  return embedding;
}
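
In practice, key the cache on normalized text (see above) so trivially different strings hit the same entry:

const embedding = await getEmbedding(normalizeText('  Sample  Text  '));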

4. Choose Appropriate Dimensions

// Good ✅ - Lower dimensions for large-scale search
{
  input: texts,
  dimensions: 512  // Faster search, less storage
}

// Good ✅ - Higher dimensions for precision
{
  input: texts,
  dimensions: 1536  // More accurate similarity
}

5. Always Handle Errors

try {
  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: 'Sample text' }
  });

  return response.embeddings[0];
} catch (error) {
  console.error('Embedding generation failed:', error);
  return null;
}
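
For transient failures such as rate limits or timeouts, a retry with exponential backoff is often more useful than returning null. A sketch that retries every error (in practice, inspect the error first and only retry retryable ones):

async function embedWithRetry(text: string, maxRetries = 3): Promise<number[]> {
  for (let attempt = 0; ; attempt++) {
    try {
      const response = await layer.embeddings({
        gateId: 'your-gate-id',
        data: { input: text }
      });
      return response.embeddings[0];
    } catch (error) {
      if (attempt >= maxRetries) throw error;
      const delayMs = 500 * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}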

Advanced Usage

Vector Database Integration

Store embeddings in a vector database:
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone();
const index = pinecone.index('my-index');

// Generate embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['doc1 text', 'doc2 text', 'doc3 text']
  }
});

// Store in Pinecone
await index.upsert(
  response.embeddings.map((embedding, i) => ({
    id: `doc-${i}`,
    values: embedding,
    metadata: { text: `doc${i + 1} text` }
  }))
);

// Query
const queryEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'search query' }
});

const results = await index.query({
  vector: queryEmb.embeddings[0],
  topK: 5
});
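
Each match returned by Pinecone carries an id and a similarity score; pass includeMetadata: true in the query above to also get back the text stored at upsert time:

// Matches come back ranked by similarity
results.matches.forEach(match => {
  console.log(`${match.id}: ${match.score}`, match.metadata?.text);
});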

Override Model

Use a specific embedding model:
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  model: 'text-embedding-3-large',
  data: {
    input: 'Sample text'
  }
});

Custom Metadata

Track embedding generation:
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    feature: 'semantic-search',
    collection: 'products'
  },
  data: {
    input: productDescriptions
  }
});

Next Steps