The `embeddings()` method converts text into high-dimensional vectors that capture semantic meaning, enabling similarity search, clustering, and RAG (Retrieval-Augmented Generation).
## Basic Embeddings

```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'The quick brown fox jumps over the lazy dog'
  }
});

console.log('Embedding vector:', response.embeddings[0]);
console.log('Dimensions:', response.embeddings[0].length);
console.log('Cost:', response.cost);
```
## Batch Embeddings

Generate embeddings for multiple texts at once:

```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: [
      'What is machine learning?',
      'How does AI work?',
      'Explain neural networks'
    ]
  }
});

// Each input gets its own embedding
response.embeddings.forEach((embedding, i) => {
  console.log(`Text ${i + 1} embedding:`, embedding.slice(0, 5), '...');
});
```
Batch multiple texts together for better performance and lower costs.
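Providers typically cap how many inputs a single request may carry, so for large collections it helps to split the list into chunks. A minimal sketch, assuming a batch limit of 100 (check your provider's actual limit):

```typescript
// Embed a large list in fixed-size batches (the batch size is an assumption)
async function embedAll(texts: string[], batchSize = 100): Promise<number[][]> {
  const all: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const response = await layer.embeddings({
      gateId: 'your-gate-id',
      data: { input: texts.slice(i, i + batchSize) }
    });
    all.push(...response.embeddings);
  }
  return all;
}
```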
## Custom Dimensions

Control the embedding vector size:

```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    dimensions: 512 // Smaller vectors = faster search
  }
});

console.log('Vector size:', response.embeddings[0].length);
// Output: Vector size: 512
```
### Dimension Trade-offs

| Dimensions | Storage | Speed | Accuracy |
|---|---|---|---|
| 256 | Low | Fast | Good |
| 512 | Medium | Medium | Better |
| 1024 | Medium | Medium | Great |
| 1536 | High | Slower | Best |
| 3072 | Very High | Slowest | Excellent |
Higher dimensions capture more nuance but require more storage and make search slower. Start with 1024 for most use cases.
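Storage grows linearly with vector size, which is why dimensions matter at scale. A back-of-the-envelope estimate, assuming the common float32 representation (4 bytes per component; actual overhead varies by vector store):

```typescript
// Rough storage estimate for float32 vectors (4 bytes per component)
function estimateStorageGB(vectorCount: number, dimensions: number): number {
  return (vectorCount * dimensions * 4) / 1024 ** 3;
}

console.log(estimateStorageGB(1_000_000, 512).toFixed(2));  // "1.91"
console.log(estimateStorageGB(1_000_000, 3072).toFixed(2)); // "11.44"
```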
## Encoding Format

Choose the output format:

```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: 'Sample text',
    encodingFormat: 'float' // 'float' or 'base64'
  }
});
```
| Format | Use Case | Storage |
|---|---|---|
| `float` | Direct computation, most libraries | Larger |
| `base64` | Network transfer, compression | Smaller |
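With `base64`, you decode the payload back into floats before computing similarity. A minimal sketch for Node.js, assuming the provider packs each vector as little-endian float32 (the common convention; confirm with your model's documentation):

```typescript
// Decode a base64-encoded vector into numbers (assumes float32 packing)
function decodeBase64Embedding(encoded: string): number[] {
  const buf = Buffer.from(encoded, 'base64');
  const floats = new Float32Array(buf.buffer, buf.byteOffset, buf.byteLength / 4);
  return Array.from(floats);
}
```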
## Parameters

### Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `input` | `string \| string[]` | Required | Text(s) to embed |
| `dimensions` | `number` | Model default | Output vector size |
| `encodingFormat` | `'float' \| 'base64'` | `'float'` | Encoding format |
## Response

```typescript
interface EmbeddingsResponse {
  id: string;              // Request ID
  model: string;           // Model used
  embeddings: number[][];  // Vector embeddings
  cost: number;            // Request cost in USD
  latency: number;         // Response time in ms
  usage?: {
    promptTokens: number;
    totalTokens: number;
  };
}
```
## Use Cases

### Semantic Search

Find similar documents:
```typescript
// 1. Embed all documents
const documents = [
  'Machine learning is a subset of AI',
  'Neural networks are inspired by the brain',
  'Python is a programming language',
  'Deep learning uses multiple layers'
];

const docEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: documents }
});

// 2. Embed the user query
const queryResponse = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'What is AI?' }
});
const queryEmbedding = queryResponse.embeddings[0];

// 3. Calculate cosine similarity
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magA * magB);
}

// 4. Find the most similar documents
const similarities = docEmbeddings.embeddings.map((docEmb, i) => ({
  document: documents[i],
  similarity: cosineSimilarity(queryEmbedding, docEmb)
}));

similarities.sort((a, b) => b.similarity - a.similarity);
console.log('Most relevant:', similarities[0].document);
// Output: "Machine learning is a subset of AI"
```
### RAG (Retrieval-Augmented Generation)

Combine embeddings with chat for context-aware responses:
```typescript
// 1. Embed the knowledge base
const knowledgeBase = [
  'Our store hours are 9am-5pm Monday-Friday',
  'We offer free shipping on orders over $50',
  'Returns are accepted within 30 days'
];

const kbEmbeddings = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: knowledgeBase }
});

// 2. User asks a question
const userQuestion = 'What are your shipping policies?';
const questionEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: userQuestion }
});

// 3. Find the most relevant context
// (cosineSimilarity is the helper defined in the Semantic Search example)
const relevantDocs = kbEmbeddings.embeddings
  .map((emb, i) => ({
    text: knowledgeBase[i],
    similarity: cosineSimilarity(questionEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 2)
  .map(d => d.text);

// 4. Answer with context
const answer = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: `Answer using this context: ${relevantDocs.join('. ')}`
      },
      {
        role: 'user',
        content: userQuestion
      }
    ]
  }
});

console.log(answer.content);
// "We offer free shipping on orders over $50."
```
### Clustering

Group similar texts:
```typescript
const texts = [
  'I love this product!',
  'Great quality, highly recommend',
  'Terrible experience, very disappointed',
  'Amazing service, will buy again',
  'Worst purchase ever'
];

const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: texts }
});

// Feed the embeddings to a clustering algorithm (k-means, etc.); see the
// sketch below. Positive reviews will cluster together, negative reviews together.
```
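The SDK only returns the vectors; clustering itself is up to you. A minimal k-means sketch over the embeddings above, illustrative only (a dedicated library such as `ml-kmeans` is a better fit for real workloads):

```typescript
// Minimal k-means over embedding vectors (illustrative, not production-grade)
function kMeans(vectors: number[][], k: number, iterations = 20): number[] {
  // Start with the first k vectors as centroids
  let centroids = vectors.slice(0, k).map(v => [...v]);
  let assignments = new Array<number>(vectors.length).fill(0);

  const dist = (a: number[], b: number[]) =>
    Math.sqrt(a.reduce((s, v, i) => s + (v - b[i]) ** 2, 0));

  for (let iter = 0; iter < iterations; iter++) {
    // Assign each vector to its nearest centroid
    assignments = vectors.map(v => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist(v, centroids[c]) < dist(v, centroids[best])) best = c;
      }
      return best;
    });

    // Move each centroid to the mean of its assigned vectors
    centroids = centroids.map((centroid, c) => {
      const members = vectors.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return centroid;
      return centroid.map((_, d) =>
        members.reduce((s, m) => s + m[d], 0) / members.length
      );
    });
  }
  return assignments; // cluster index per input text
}

const clusters = kMeans(response.embeddings, 2);
clusters.forEach((c, i) => console.log(`Cluster ${c}:`, texts[i]));
```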
### Duplicate Detection

Find duplicate or near-duplicate content:
```typescript
const newArticle = 'How to train a machine learning model';

const newEmbedding = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: newArticle }
});

// Compare against existing articles
const existingEmbeddings = [ /* ... */ ]; // Previously computed
const duplicates = existingEmbeddings.filter(existing => {
  const similarity = cosineSimilarity(
    newEmbedding.embeddings[0],
    existing.embedding
  );
  return similarity > 0.95; // Very similar
});

if (duplicates.length > 0) {
  console.log('Potential duplicate found');
}
```
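The shape of `existingEmbeddings` is left open above; one way to build it is to pair each stored article with its precomputed vector. A sketch (the `StoredEmbedding` type and `embedArticle` helper are hypothetical, not part of the SDK):

```typescript
// Hypothetical shape for precomputed embeddings
interface StoredEmbedding {
  id: string;
  embedding: number[];
}

// Compute once per article and persist alongside your content
async function embedArticle(id: string, text: string): Promise<StoredEmbedding> {
  const res = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });
  return { id, embedding: res.embeddings[0] };
}
```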
### Recommendation System

Recommend similar items:
```typescript
// User liked this movie
const likedMovie = 'A sci-fi thriller about time travel';
const likedEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: likedMovie }
});

// Find similar movies
const movieDatabase = [
  'A space adventure with aliens',
  'A romantic comedy about weddings',
  'A time-bending mystery thriller',
  'A cooking competition show'
];

const movieEmbs = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: movieDatabase }
});

const recommendations = movieEmbs.embeddings
  .map((emb, i) => ({
    movie: movieDatabase[i],
    similarity: cosineSimilarity(likedEmb.embeddings[0], emb)
  }))
  .sort((a, b) => b.similarity - a.similarity)
  .slice(0, 3);

console.log('You might like:', recommendations[0].movie);
// "A time-bending mystery thriller"
```
## Best Practices
### 1. Batch Your Inputs

```typescript
// Good ✅ - Batch multiple texts in one request
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['text1', 'text2', 'text3', 'text4', 'text5']
  }
});

// Less efficient - One request per text
for (const text of texts) {
  await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });
}
```
### 2. Normalize Text First
```typescript
function normalizeText(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/\s+/g, ' ');
}

// Good ✅ - Consistent embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: normalizeText(' Sample Text ')
  }
});
```
### 3. Cache Embeddings
```typescript
const embeddingCache = new Map<string, number[]>();

async function getEmbedding(text: string): Promise<number[]> {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text)!;
  }

  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: text }
  });

  const embedding = response.embeddings[0];
  embeddingCache.set(text, embedding);
  return embedding;
}
```
### 4. Choose Appropriate Dimensions
```typescript
// Good ✅ - Lower dimensions for large-scale search
{
  input: texts,
  dimensions: 512 // Faster search, less storage
}

// Good ✅ - Higher dimensions for precision
{
  input: texts,
  dimensions: 1536 // More accurate similarity
}
```
### 5. Always Handle Errors
```typescript
try {
  const response = await layer.embeddings({
    gateId: 'your-gate-id',
    data: { input: 'Sample text' }
  });
  return response.embeddings[0];
} catch (error) {
  console.error('Embedding generation failed:', error);
  return null;
}
```
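For transient failures such as rate limits or network blips, retrying with exponential backoff often helps. A minimal sketch, assuming every error is safe to retry (in real code, inspect the error and only retry retryable ones):

```typescript
// Retry with exponential backoff (sketch; tune delays and retry conditions)
async function embedWithRetry(text: string, maxRetries = 3): Promise<number[]> {
  for (let attempt = 0; ; attempt++) {
    try {
      const response = await layer.embeddings({
        gateId: 'your-gate-id',
        data: { input: text }
      });
      return response.embeddings[0];
    } catch (error) {
      if (attempt >= maxRetries) throw error;
      const delayMs = 500 * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
}
```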
## Advanced Usage

### Vector Database Integration

Store embeddings in a vector database:
```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone();
const index = pinecone.index('my-index');

// Generate embeddings
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  data: {
    input: ['doc1 text', 'doc2 text', 'doc3 text']
  }
});

// Store in Pinecone
await index.upsert(
  response.embeddings.map((embedding, i) => ({
    id: `doc-${i}`,
    values: embedding,
    metadata: { text: `doc${i + 1} text` }
  }))
);

// Query
const queryEmb = await layer.embeddings({
  gateId: 'your-gate-id',
  data: { input: 'search query' }
});

const results = await index.query({
  vector: queryEmb.embeddings[0],
  topK: 5
});
```
### Override Model

Use a specific embedding model:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  model: 'text-embedding-3-large',
  data: {
    input: 'Sample text'
  }
});
```
### Custom Metadata

Attach metadata to track embedding generation:
```typescript
const response = await layer.embeddings({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    feature: 'semantic-search',
    collection: 'products'
  },
  data: {
    input: productDescriptions
  }
});
```
## Next Steps

- **Chat**: Use embeddings with RAG
- **Gates & Routing**: How Layer routes embedding requests
- **Cost Tracking**: Monitor embedding costs