Skip to main content
The ocr() method extracts text, tables, and structured data from images and PDFs using optical character recognition.

Basic OCR

Extract text from an image URL:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/document.jpg'
  }
});

console.log('Extracted text:', response.ocr.pages[0].markdown);
console.log('Pages:', response.ocr.pages.length);
console.log('Cost:', response.cost);

Input Methods

Image URL

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/receipt.jpg'
  }
});

Document URL (PDFs)

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/invoice.pdf'
  }
});

// Multi-page PDF
response.ocr.pages.forEach((page, i) => {
  console.log(`Page ${i + 1}:`, page.markdown);
});

Base64 Images

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    base64: 'data:image/png;base64,iVBORw0KGgoAAAANS...',
    mimeType: 'image/png'
  }
});

Table Extraction

Extract tables in different formats:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/financial-report.pdf',
    tableFormat: 'markdown'  // 'markdown' or 'html'
  }
});

// Access tables
response.ocr.pages.forEach(page => {
  if (page.tables) {
    page.tables.forEach(table => {
      console.log('Table (HTML):', table.html);
    });
  }
});

Table Format Options

FormatUse Case
markdownSimple display, documentation
htmlWeb rendering, styling
Use markdown format for quick display or LLM processing. Use html for web applications where you need custom styling.

Image Extraction

Extract embedded images from documents:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/presentation.pdf',
    includeImageBase64: true
  }
});

// Access embedded images
response.ocr.pages.forEach(page => {
  if (page.images) {
    page.images.forEach(img => {
      console.log('Image ID:', img.id);
      console.log('Image data:', img.base64);
      console.log('Position:', img.topLeftX, img.topLeftY);
    });
  }
});

Headers and Footers

Extract headers and footers:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/contract.pdf',
    extractHeader: true,
    extractFooter: true
  }
});

response.ocr.pages.forEach(page => {
  console.log('Header:', page.header);
  console.log('Footer:', page.footer);
  console.log('Main content:', page.markdown);
});

Parameters

Request Parameters

ParameterTypeDefaultDescription
documentUrlstring-PDF document URL
imageUrlstring-Image URL
base64string-Base64-encoded image/document
mimeTypestring-MIME type for base64
tableFormat'markdown' | 'html'markdownTable output format
includeImageBase64booleanfalseExtract embedded images
extractHeaderbooleanfalseExtract page headers
extractFooterbooleanfalseExtract page footers

Response

interface OCRResponse {
  id: string;                    // Request ID
  model: string;                 // Model used
  ocr: OCROutput;                // Extracted data
  cost: number;                  // Request cost in USD
  latency: number;               // Response time in ms
}

interface OCROutput {
  pages: OCRPage[];              // Extracted pages
  model: string;                 // OCR model used
  documentAnnotation?: object;   // Document metadata
  usageInfo?: {
    pagesProcessed?: number;
    docSizeBytes?: number;
  };
}

interface OCRPage {
  index: number;                 // Page number
  markdown: string;              // Extracted text
  images?: Array<{
    id: string;
    base64?: string;
    topLeftX?: number;
    topLeftY?: number;
    bottomRightX?: number;
    bottomRightY?: number;
  }>;
  tables?: Array<{
    id: string;
    html?: string;
  }>;
  hyperlinks?: Array<{
    text: string;
    url: string;
  }>;
  header?: string | null;
  footer?: string | null;
  dimensions?: {
    width: number;
    height: number;
    dpi?: number;
  };
}

Best Practices

1. Use High-Quality Images

// Good ✅ - Clear, high-resolution scan
{
  imageUrl: 'https://example.com/high-res-scan.jpg'
}

// Less effective - Low resolution, blurry
{
  imageUrl: 'https://example.com/phone-photo.jpg'
}

2. Choose Appropriate Input Method

// Good ✅ - PDF for multi-page documents
{
  documentUrl: 'https://example.com/report.pdf'
}

// Good ✅ - Image URL for single pages
{
  imageUrl: 'https://example.com/receipt.jpg'
}

// Good ✅ - Base64 for uploaded files
{
  base64: uploadedImageData,
  mimeType: 'image/jpeg'
}

3. Extract Only What You Need

// Good ✅ - Minimal extraction for better performance
{
  documentUrl: 'url',
  tableFormat: 'markdown'
}

// Less efficient - Extracting everything when not needed
{
  documentUrl: 'url',
  tableFormat: 'html',
  includeImageBase64: true,
  extractHeader: true,
  extractFooter: true
}

4. Always Handle Errors

try {
  const response = await layer.ocr({
    gateId: 'your-gate-id',
    data: {
      documentUrl: 'https://example.com/doc.pdf'
    }
  });

  return response.ocr.pages[0].markdown;
} catch (error) {
  console.error('OCR failed:', error);
  return null;
}

5. Process Pages Efficiently

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/large-doc.pdf'
  }
});

// Process pages in parallel if needed
const processedPages = await Promise.all(
  response.ocr.pages.map(async (page) => {
    // Process each page
    return processPage(page.markdown);
  })
);

Examples

Invoice Processing

const response = await layer.ocr({
  gateId: 'your-gate-id',
  metadata: { type: 'invoice' },
  data: {
    documentUrl: 'https://example.com/invoice.pdf',
    tableFormat: 'markdown',
    extractHeader: true
  }
});

// Extract invoice data
const invoiceText = response.ocr.pages[0].markdown;

// Use LLM to structure the data
const structured = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract invoice number, date, total, and line items as JSON'
      },
      {
        role: 'user',
        content: invoiceText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

const invoiceData = JSON.parse(structured.content);
console.log('Invoice #:', invoiceData.invoiceNumber);
console.log('Total:', invoiceData.total);

Receipt Scanning

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/receipt.jpg'
  }
});

const receiptText = response.ocr.pages[0].markdown;

// Extract key information
const items = receiptText.match(/\$\d+\.\d{2}/g);
console.log('Prices found:', items);

Document Digitization

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/archive-doc.pdf',
    includeImageBase64: true,
    extractHeader: true,
    extractFooter: true
  }
});

// Save to database
for (const page of response.ocr.pages) {
  await db.pages.create({
    pageNumber: page.index,
    content: page.markdown,
    header: page.header,
    footer: page.footer,
    images: page.images
  });
}

Form Processing

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/application-form.jpg'
  }
});

const formText = response.ocr.pages[0].markdown;

// Extract form fields with LLM
const formData = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract form fields: name, email, phone, address as JSON'
      },
      {
        role: 'user',
        content: formText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

Table Extraction for Analysis

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/financial-report.pdf',
    tableFormat: 'html'
  }
});

// Extract all tables
const allTables = response.ocr.pages.flatMap(page =>
  page.tables || []
);

console.log(`Found ${allTables.length} tables`);

// Process tables
allTables.forEach(table => {
  // Parse HTML table and analyze
  parseHTMLTable(table.html);
});

ID Card Scanning

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    imageUrl: 'https://example.com/drivers-license.jpg'
  }
});

const idText = response.ocr.pages[0].markdown;

// Extract structured data
const idInfo = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'Extract: name, license number, date of birth, expiration date as JSON'
      },
      {
        role: 'user',
        content: idText
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

Advanced Usage

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/article.pdf'
  }
});

response.ocr.pages.forEach(page => {
  if (page.hyperlinks) {
    console.log('Links found:');
    page.hyperlinks.forEach(link => {
      console.log(`${link.text} -> ${link.url}`);
    });
  }
});

Page Dimensions

const response = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: 'https://example.com/doc.pdf'
  }
});

response.ocr.pages.forEach(page => {
  if (page.dimensions) {
    console.log(`Page ${page.index + 1}:`);
    console.log(`  Size: ${page.dimensions.width}x${page.dimensions.height}`);
    console.log(`  DPI: ${page.dimensions.dpi}`);
  }
});

Override Model

Use a specific OCR model:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  model: 'textract-analyze',
  data: {
    documentUrl: 'https://example.com/complex-doc.pdf'
  }
});

Custom Metadata

Track OCR processing:
const response = await layer.ocr({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    documentType: 'invoice',
    department: 'accounting'
  },
  data: {
    documentUrl: 'https://example.com/invoice.pdf'
  }
});

Use Cases

Document Management Systems

Digitize and index documents:
const documents = await getDocuments();

for (const doc of documents) {
  const response = await layer.ocr({
    gateId: 'your-gate-id',
    data: {
      documentUrl: doc.url,
      tableFormat: 'markdown'
    }
  });

  // Index for search
  await searchIndex.add({
    id: doc.id,
    content: response.ocr.pages.map(p => p.markdown).join('\n')
  });
}

Expense Management

Automate expense report processing:
const receiptOCR = await layer.ocr({
  gateId: 'your-gate-id',
  data: { imageUrl: receiptUrl }
});

const expense = await layer.chat({
  gateId: 'your-chat-gate-id',
  data: {
    messages: [{
      role: 'user',
      content: `Extract merchant, date, total, category from: ${receiptOCR.ocr.pages[0].markdown}`
    }],
    responseFormat: { type: 'json_object' }
  }
});
Extract and verify contract terms:
const contractOCR = await layer.ocr({
  gateId: 'your-gate-id',
  data: {
    documentUrl: contractPdfUrl,
    extractHeader: true,
    extractFooter: true
  }
});

// Search for specific clauses
const hasArbitrationClause = contractOCR.ocr.pages.some(page =>
  page.markdown.toLowerCase().includes('arbitration')
);

Next Steps