The embeddings() method converts text into high-dimensional vectors that capture semantic meaning, enabling similarity search, clustering, and RAG (Retrieval-Augmented Generation).
Basic Embeddings
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'The quick brown fox jumps over the lazy dog'
  }
});

console.log('Embedding vector:', response.embeddings[0]);
console.log('Dimensions:', response.embeddings[0].length);
console.log('Cost:', response.cost);
```
Batch Embeddings
Generate embeddings for multiple texts at once:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: [
      'What is machine learning?',
      'How does AI work?',
      'Explain neural networks'
    ]
  }
});

// Each input gets its own embedding, in the same order
response.embeddings.forEach((embedding, i) => {
  console.log(`Text ${i + 1} embedding:`, embedding.slice(0, 5), '...');
});
```
Batch multiple texts together for better performance and lower costs.
Custom Dimensions
Control the embedding vector size:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    dimensions: 512 // Smaller vectors = faster search
  }
});

console.log('Vector size:', response.embeddings[0].length);
// Output: Vector size: 512
```
Dimension Trade-offs
| Dimensions | Storage | Speed | Accuracy |
|---|---|---|---|
| 256 | Low | Fast | Good |
| 512 | Medium | Medium | Better |
| 1024 | Medium | Medium | Great |
| 1536 | High | Slower | Best |
| 3072 | Very High | Slowest | Excellent |
Higher dimensions capture more nuance but require more storage and make similarity search slower. Start with 1024 dimensions for most use cases.
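To gauge the storage side of this trade-off: stored as float32, each dimension costs 4 bytes per vector. A quick back-of-the-envelope helper (illustrative only, not part of the SDK; index overhead not included):

```typescript
// Approximate raw storage for float32 embedding vectors:
// 4 bytes per dimension, per vector.
function embeddingStorageBytes(numVectors: number, dimensions: number): number {
  return numVectors * dimensions * 4;
}

// 1 million documents at 512 vs 3072 dimensions:
const small = embeddingStorageBytes(1_000_000, 512);
const large = embeddingStorageBytes(1_000_000, 3072);
console.log(`512d: ${(small / 1e9).toFixed(1)} GB, 3072d: ${(large / 1e9).toFixed(1)} GB`);
// 512d: 2.0 GB, 3072d: 12.3 GB
```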
Encoding Format
Choose the output format:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    encodingFormat: 'float' // 'float' or 'base64'
  }
});
```
| Format | Use Case | Storage |
|---|---|---|
| `float` | Direct computation, most libraries | Larger |
| `base64` | Network transfer, compression | Smaller |
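If you request `base64`, you will typically need to decode the payload back into numbers before computing similarity. A minimal sketch, assuming the common convention of packing the vector as platform-endian (in practice little-endian) float32 — verify this against your provider's documented format:

```typescript
// Decode a base64-encoded float32 embedding back into a number[].
// The copy into a fresh Uint8Array guarantees 4-byte alignment.
function decodeBase64Embedding(b64: string): number[] {
  const bytes = new Uint8Array(Buffer.from(b64, 'base64'));
  return Array.from(new Float32Array(bytes.buffer));
}

// Round-trip example:
const original = new Float32Array([0.25, -0.5, 1.0]);
const encoded = Buffer.from(original.buffer).toString('base64');
console.log(decodeBase64Embedding(encoded)); // [0.25, -0.5, 1]
```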
Parameters
Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `input` | `string \| string[]` | Required | Text(s) to embed |
| `dimensions` | `number` | Model default | Output vector size |
| `encodingFormat` | `'float' \| 'base64'` | `'float'` | Encoding format |
Response
```typescript
interface EmbeddingsResponse {
  id: string;             // Request ID
  model: string;          // Model used
  embeddings: number[][]; // Vector embeddings
  cost: number;           // Request cost in USD
  latency: number;        // Response time in ms
  usage?: {
    promptTokens: number;
    totalTokens: number;
  };
}
```
Use Cases
Semantic Search
Find similar documents:
```typescript
// 1. Embed all documents
const documents = [
  'Machine learning is a subset of AI',
  'Neural networks are inspired by the brain',
  'Python is a programming language',
  'Deep learning uses multiple layers'
];

const docEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: documents }
});

// 2. Embed the user query
const queryResponse = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'What is AI?' }
});
const queryEmbedding = queryResponse.embeddings[0];

// 3. Calculate cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magA * magB);
}

// 4. Find the most similar documents
const similarities = docEmbeddings.embeddings.map((docEmb, i) => ({
  document: documents[i],
  similarity: cosineSimilarity(queryEmbedding, docEmb)
}));

similarities.sort((a, b) => b.similarity - a.similarity);
console.log('Most relevant:', similarities[0].document);
// Output: "Machine learning is a subset of AI"
```
RAG (Retrieval-Augmented Generation)
Combine embeddings with chat for context-aware responses:
```typescript
// 1. Embed the knowledge base
const knowledgeBase = [
  'Our store hours are 9am-5pm Monday-Friday',
  'We offer free shipping on orders over $50',
  'Returns are accepted within 30 days'
];

const kbEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: knowledgeBase }
});

// 2. User asks a question
const userQuestion = 'What are your shipping policies?';
const questionEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: userQuestion }
});

// 3. Find the most relevant context
const relevantDocs = kbEmbeddings.embeddings
  .map((emb, i) => ({
    text: knowledgeBase[i],
    similarity: cosineSimilarity(questionEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 2)
  .map(d => d.text);

// 4. Answer with the retrieved context
const answer = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: `Answer using this context: ${relevantDocs.join('. ')}`
      },
      {
        role: 'user',
        content: userQuestion
      }
    ]
  }
});

console.log(answer.content);
// "We offer free shipping on orders over $50."
```
Clustering
Group similar texts:
```typescript
const texts = [
  'I love this product!',
  'Great quality, highly recommend',
  'Terrible experience, very disappointed',
  'Amazing service, will buy again',
  'Worst purchase ever'
];

const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: texts }
});

// Use the embeddings with a clustering algorithm (k-means, etc.)
// Positive reviews will cluster together, and negative reviews together
```
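To make the k-means step concrete, here is a minimal sketch over tiny 2-d toy vectors standing in for real embeddings (a production system would use a proper library and better centroid initialization):

```typescript
// Naive k-means: assign each vector to its nearest centroid,
// then recompute centroids as the mean of their members.
function kMeans(vectors: number[][], k: number, iters = 10): number[] {
  const dist = (a: number[], b: number[]) =>
    a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0);
  // Naive init: the first k vectors seed the centroids
  const centroids = vectors.slice(0, k).map(v => [...v]);
  let labels: number[] = new Array(vectors.length).fill(0);
  for (let it = 0; it < iters; it++) {
    // Assignment step: nearest centroid wins
    labels = vectors.map(v => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist(v, centroids[c]) < dist(v, centroids[best])) best = c;
      }
      return best;
    });
    // Update step: centroid = mean of its members
    for (let c = 0; c < k; c++) {
      const members = vectors.filter((_, i) => labels[i] === c);
      if (members.length === 0) continue;
      centroids[c] = centroids[c].map((_, d) =>
        members.reduce((s, m) => s + m[d], 0) / members.length
      );
    }
  }
  return labels;
}

// Two obvious groups: positive-ish and negative-ish "embeddings"
const toy = [[1, 1], [0.9, 1.1], [-1, -1], [-0.9, -1.2]];
const toyLabels = kMeans(toy, 2);
console.log(toyLabels); // groups indices 0,1 together and 2,3 together
```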
Duplicate Detection
Find duplicate or near-duplicate content:
```typescript
const newArticle = 'How to train a machine learning model';

const newEmbedding = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: newArticle }
});

// Compare against existing articles
const existingEmbeddings = [...]; // Previously computed
const duplicates = existingEmbeddings.filter(existing => {
  const similarity = cosineSimilarity(
    newEmbedding.embeddings[0],
    existing.embedding
  );
  return similarity > 0.95; // Very similar
});

if (duplicates.length > 0) {
  console.log('Potential duplicate found');
}
```
Recommendation System
Recommend similar items:
```typescript
// User liked this movie
const likedMovie = 'A sci-fi thriller about time travel';
const likedEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: likedMovie }
});

// Find similar movies
const movieDatabase = [
  'A space adventure with aliens',
  'A romantic comedy about weddings',
  'A time-bending mystery thriller',
  'A cooking competition show'
];

const movieEmbs = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: movieDatabase }
});

const recommendations = movieEmbs.embeddings
  .map((emb, i) => ({
    movie: movieDatabase[i],
    similarity: cosineSimilarity(likedEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 3);

console.log('You might like:', recommendations[0].movie);
// "A time-bending mystery thriller"
```
Best Practices
1. Batch Your Requests

```typescript
// Good ✅ - Batch multiple texts
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['text1', 'text2', 'text3', 'text4', 'text5']
  }
});

// Less efficient ❌ - Multiple requests
for (const text of texts) {
  await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });
}
```
2. Normalize Text First
```typescript
function normalizeText(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/\s+/g, ' ');
}

// Good ✅ - Consistent embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: normalizeText(' Sample Text ')
  }
});
```
3. Cache Embeddings
```typescript
const embeddingCache = new Map<string, number[]>();

async function getEmbedding(text: string): Promise<number[]> {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text)!;
  }
  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });
  const embedding = response.embeddings[0];
  embeddingCache.set(text, embedding);
  return embedding;
}
```
4. Choose Appropriate Dimensions
```typescript
// Good ✅ - Lower dimensions for large-scale search
{
  input: texts,
  dimensions: 512 // Faster search, less storage
}

// Good ✅ - Higher dimensions for precision
{
  input: texts,
  dimensions: 1536 // More accurate similarity
}
```
5. Always Handle Errors
```typescript
try {
  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: 'Sample text' }
  });
  return response.embeddings[0];
} catch (error) {
  console.error('Embedding generation failed:', error);
  return null;
}
```
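Transient failures such as rate limits or timeouts often succeed on a second attempt, so retrying with exponential backoff pairs well with the error handling above. A generic wrapper sketch (`withRetry` is illustrative, not part of the SDK):

```typescript
// Retry an async operation with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        // Backoff doubles each attempt: 500 ms, 1 s, 2 s, ...
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

Usage might look like `await withRetry(() => layer.embeddings({ gateId: 'your-gate-id', data: { input: 'Sample text' } }))`, ideally retrying only on errors your SDK reports as transient.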
Advanced Usage
Vector Database Integration
Store embeddings in a vector database:
```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone();
const index = pinecone.index('my-index');

// Generate embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['doc1 text', 'doc2 text', 'doc3 text']
  }
});

// Store in Pinecone
await index.upsert(
  response.embeddings.map((embedding, i) => ({
    id: `doc-${i}`,
    values: embedding,
    metadata: { text: `doc${i + 1} text` }
  }))
);

// Query
const queryEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'search query' }
});

const results = await index.query({
  vector: queryEmb.embeddings[0],
  topK: 5
});
```
Override Model
Use a specific embedding model:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  model: 'text-embedding-3-large',
  data: {
    input: 'Sample text'
  }
});
```
Custom Metadata
Attach metadata to track embedding requests:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    feature: 'semantic-search',
    collection: 'products'
  },
  data: {
    input: productDescriptions
  }
});
```
Next Steps