The ocr() method extracts text, tables, and structured data from images and PDFs using optical character recognition.
Basic OCR
Extract text from an image URL:
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
imageUrl: 'https://example.com/document.jpg'
}
});
console.log('Extracted text:', response.ocr.pages[0].markdown);
console.log('Pages:', response.ocr.pages.length);
console.log('Cost:', response.cost);
Image URL
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
imageUrl: 'https://example.com/receipt.jpg'
}
});
Document URL (PDFs)
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/invoice.pdf'
}
});
// Multi-page PDF
response.ocr.pages.forEach((page, i) => {
console.log(`Page ${i + 1}:`, page.markdown);
});
Base64 Images
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
base64: 'data:image/png;base64,iVBORw0KGgoAAAANS...',
mimeType: 'image/png'
}
});
Extract tables in different formats:
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/financial-report.pdf',
tableFormat: 'markdown' // 'markdown' or 'html'
}
});
// Access tables
response.ocr.pages.forEach(page => {
if (page.tables) {
page.tables.forEach(table => {
console.log('Table (HTML):', table.html);
});
}
});
| Format | Use Case |
|---|
markdown | Simple display, documentation |
html | Web rendering, styling |
Use markdown format for quick display or LLM processing. Use html for web applications where you need custom styling.
Extract embedded images from documents:
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/presentation.pdf',
includeImageBase64: true
}
});
// Access embedded images
response.ocr.pages.forEach(page => {
if (page.images) {
page.images.forEach(img => {
console.log('Image ID:', img.id);
console.log('Image data:', img.base64);
console.log('Position:', img.topLeftX, img.topLeftY);
});
}
});
Extract headers and footers:
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/contract.pdf',
extractHeader: true,
extractFooter: true
}
});
response.ocr.pages.forEach(page => {
console.log('Header:', page.header);
console.log('Footer:', page.footer);
console.log('Main content:', page.markdown);
});
Parameters
Request Parameters
| Parameter | Type | Default | Description |
|---|
documentUrl | string | - | PDF document URL |
imageUrl | string | - | Image URL |
base64 | string | - | Base64-encoded image/document |
mimeType | string | - | MIME type for base64 |
tableFormat | 'markdown' | 'html' | markdown | Table output format |
includeImageBase64 | boolean | false | Extract embedded images |
extractHeader | boolean | false | Extract page headers |
extractFooter | boolean | false | Extract page footers |
Response
interface OCRResponse {
id: string; // Request ID
model: string; // Model used
ocr: OCROutput; // Extracted data
cost: number; // Request cost in USD
latency: number; // Response time in ms
}
interface OCROutput {
pages: OCRPage[]; // Extracted pages
model: string; // OCR model used
documentAnnotation?: object; // Document metadata
usageInfo?: {
pagesProcessed?: number;
docSizeBytes?: number;
};
}
interface OCRPage {
index: number; // Page number
markdown: string; // Extracted text
images?: Array<{
id: string;
base64?: string;
topLeftX?: number;
topLeftY?: number;
bottomRightX?: number;
bottomRightY?: number;
}>;
tables?: Array<{
id: string;
html?: string;
}>;
hyperlinks?: Array<{
text: string;
url: string;
}>;
header?: string | null;
footer?: string | null;
dimensions?: {
width: number;
height: number;
dpi?: number;
};
}
Best Practices
1. Use High-Quality Images
// Good ✅ - Clear, high-resolution scan
{
imageUrl: 'https://example.com/high-res-scan.jpg'
}
// Less effective - Low resolution, blurry
{
imageUrl: 'https://example.com/phone-photo.jpg'
}
// Good ✅ - PDF for multi-page documents
{
documentUrl: 'https://example.com/report.pdf'
}
// Good ✅ - Image URL for single pages
{
imageUrl: 'https://example.com/receipt.jpg'
}
// Good ✅ - Base64 for uploaded files
{
base64: uploadedImageData,
mimeType: 'image/jpeg'
}
// Good ✅ - Minimal extraction for better performance
{
documentUrl: 'url',
tableFormat: 'markdown'
}
// Less efficient - Extracting everything when not needed
{
documentUrl: 'url',
tableFormat: 'html',
includeImageBase64: true,
extractHeader: true,
extractFooter: true
}
4. Always Handle Errors
try {
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/doc.pdf'
}
});
return response.ocr.pages[0].markdown;
} catch (error) {
console.error('OCR failed:', error);
return null;
}
5. Process Pages Efficiently
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/large-doc.pdf'
}
});
// Process pages in parallel if needed
const processedPages = await Promise.all(
response.ocr.pages.map(async (page) => {
// Process each page
return processPage(page.markdown);
})
);
Examples
Invoice Processing
const response = await layer.ocr({
gateId: 'your-gate-id',
metadata: { type: 'invoice' },
data: {
documentUrl: 'https://example.com/invoice.pdf',
tableFormat: 'markdown',
extractHeader: true
}
});
// Extract invoice data
const invoiceText = response.ocr.pages[0].markdown;
// Use LLM to structure the data
const structured = await layer.chat({
gateId: 'your-chat-gate-id',
data: {
messages: [
{
role: 'system',
content: 'Extract invoice number, date, total, and line items as JSON'
},
{
role: 'user',
content: invoiceText
}
],
responseFormat: { type: 'json_object' }
}
});
const invoiceData = JSON.parse(structured.content);
console.log('Invoice #:', invoiceData.invoiceNumber);
console.log('Total:', invoiceData.total);
Receipt Scanning
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
imageUrl: 'https://example.com/receipt.jpg'
}
});
const receiptText = response.ocr.pages[0].markdown;
// Extract key information
const items = receiptText.match(/\$\d+\.\d{2}/g);
console.log('Prices found:', items);
Document Digitization
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/archive-doc.pdf',
includeImageBase64: true,
extractHeader: true,
extractFooter: true
}
});
// Save to database
for (const page of response.ocr.pages) {
await db.pages.create({
pageNumber: page.index,
content: page.markdown,
header: page.header,
footer: page.footer,
images: page.images
});
}
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
imageUrl: 'https://example.com/application-form.jpg'
}
});
const formText = response.ocr.pages[0].markdown;
// Extract form fields with LLM
const formData = await layer.chat({
gateId: 'your-chat-gate-id',
data: {
messages: [
{
role: 'system',
content: 'Extract form fields: name, email, phone, address as JSON'
},
{
role: 'user',
content: formText
}
],
responseFormat: { type: 'json_object' }
}
});
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/financial-report.pdf',
tableFormat: 'html'
}
});
// Extract all tables
const allTables = response.ocr.pages.flatMap(page =>
page.tables || []
);
console.log(`Found ${allTables.length} tables`);
// Process tables
allTables.forEach(table => {
// Parse HTML table and analyze
parseHTMLTable(table.html);
});
ID Card Scanning
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
imageUrl: 'https://example.com/drivers-license.jpg'
}
});
const idText = response.ocr.pages[0].markdown;
// Extract structured data
const idInfo = await layer.chat({
gateId: 'your-chat-gate-id',
data: {
messages: [
{
role: 'system',
content: 'Extract: name, license number, date of birth, expiration date as JSON'
},
{
role: 'user',
content: idText
}
],
responseFormat: { type: 'json_object' }
}
});
Advanced Usage
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/article.pdf'
}
});
response.ocr.pages.forEach(page => {
if (page.hyperlinks) {
console.log('Links found:');
page.hyperlinks.forEach(link => {
console.log(`${link.text} -> ${link.url}`);
});
}
});
Page Dimensions
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: 'https://example.com/doc.pdf'
}
});
response.ocr.pages.forEach(page => {
if (page.dimensions) {
console.log(`Page ${page.index + 1}:`);
console.log(` Size: ${page.dimensions.width}x${page.dimensions.height}`);
console.log(` DPI: ${page.dimensions.dpi}`);
}
});
Override Model
Use a specific OCR model:
const response = await layer.ocr({
gateId: 'your-gate-id',
model: 'textract-analyze',
data: {
documentUrl: 'https://example.com/complex-doc.pdf'
}
});
Track OCR processing:
const response = await layer.ocr({
gateId: 'your-gate-id',
metadata: {
userId: 'user-123',
documentType: 'invoice',
department: 'accounting'
},
data: {
documentUrl: 'https://example.com/invoice.pdf'
}
});
Use Cases
Document Management Systems
Digitize and index documents:
const documents = await getDocuments();
for (const doc of documents) {
const response = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: doc.url,
tableFormat: 'markdown'
}
});
// Index for search
await searchIndex.add({
id: doc.id,
content: response.ocr.pages.map(p => p.markdown).join('\n')
});
}
Expense Management
Automate expense report processing:
const receiptOCR = await layer.ocr({
gateId: 'your-gate-id',
data: { imageUrl: receiptUrl }
});
const expense = await layer.chat({
gateId: 'your-chat-gate-id',
data: {
messages: [{
role: 'user',
content: `Extract merchant, date, total, category from: ${receiptOCR.ocr.pages[0].markdown}`
}],
responseFormat: { type: 'json_object' }
}
});
Compliance & Legal
Extract and verify contract terms:
const contractOCR = await layer.ocr({
gateId: 'your-gate-id',
data: {
documentUrl: contractPdfUrl,
extractHeader: true,
extractFooter: true
}
});
// Search for specific clauses
const hasArbitrationClause = contractOCR.ocr.pages.some(page =>
page.markdown.toLowerCase().includes('arbitration')
);
Next Steps