Output Formats

LeapOCR supports three output formats: `structured`, `markdown`, and `per_page_structured`. Choose the one that best fits your application architecture.

Format Types

Structured Format

Returns a single JSON object with extracted fields from the entire document.

Use case: Extract specific data points across a complete document

Example output:

{
  "invoice_number": "INV-2024-001",
  "total_amount": 1234.56,
  "invoice_date": "2024-01-15",
  "vendor_name": "ACME Corp",
  "line_items": [
    {
      "description": "Service Fee",
      "amount": 1000.0
    },
    {
      "description": "Tax",
      "amount": 234.56
    }
  ]
}

Best for:

Invoices
Forms with specific fields
Single-record documents
Database insertion
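As an illustration of the database insertion use case, here is a minimal Python sketch that loads the example output above and writes it to SQLite. The table names, column layout, and the `store_invoice` helper are assumptions made for this example, not part of LeapOCR. How the JSON is retrieved from the API is not shown.

```python
import json
import sqlite3

# Sample payload copied from the example output above. In practice this
# would be the JSON returned for a document processed in structured format.
RESULT_JSON = """
{
  "invoice_number": "INV-2024-001",
  "total_amount": 1234.56,
  "invoice_date": "2024-01-15",
  "vendor_name": "ACME Corp",
  "line_items": [
    {"description": "Service Fee", "amount": 1000.0},
    {"description": "Tax", "amount": 234.56}
  ]
}
"""


def store_invoice(conn, result):
    """Insert one structured result into two hypothetical tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices ("
        "invoice_number TEXT PRIMARY KEY, vendor_name TEXT, "
        "invoice_date TEXT, total_amount REAL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS line_items ("
        "invoice_number TEXT, description TEXT, amount REAL)"
    )
    # One row for the document-level fields.
    conn.execute(
        "INSERT INTO invoices VALUES (?, ?, ?, ?)",
        (
            result["invoice_number"],
            result["vendor_name"],
            result["invoice_date"],
            result["total_amount"],
        ),
    )
    # One row per nested line item, keyed by the invoice number.
    conn.executemany(
        "INSERT INTO line_items VALUES (?, ?, ?)",
        [
            (result["invoice_number"], item["description"], item["amount"])
            for item in result.get("line_items", [])
        ],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store_invoice(conn, json.loads(RESULT_JSON))
line_item_count = conn.execute("SELECT COUNT(*) FROM line_items").fetchone()[0]
print(line_item_count)
```

Because the structured format returns a single object per document, one document maps naturally to one parent row plus child rows for any nested arrays.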

Markdown Format

Returns text content from each page in markdown format.

Use case: Convert documents to readable, formatted text

Example output:

# Invoice

**Invoice Number**: INV-2024-001
**Date**: January 15, 2024
**Total**: $1,234.56

## Line Items

- Service Fee: $1,000.00
- Tax: $234.56

Best for:

Document archival
Text analysis
Search indexing
Human-readable output
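For search indexing, markdown output can be split into sections by heading before being sent to an index. The sketch below uses the example output above as input. The `split_sections` helper is hypothetical and handles only ATX-style headings (lines starting with `#`).

```python
import re

# Sample text copied from the markdown example output above.
MARKDOWN = """# Invoice

**Invoice Number**: INV-2024-001
**Date**: January 15, 2024
**Total**: $1,234.56

## Line Items

- Service Fee: $1,000.00
- Tax: $234.56
"""


def split_sections(markdown_text):
    """Group non-blank lines under the nearest preceding heading."""
    sections = []
    current = {"heading": "", "body": []}
    for line in markdown_text.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # A new heading closes the section collected so far.
            if current["heading"] or current["body"]:
                sections.append(current)
            current = {"heading": match.group(2).strip(), "body": []}
        elif line.strip():
            current["body"].append(line.strip())
    if current["heading"] or current["body"]:
        sections.append(current)
    return sections


sections = split_sections(MARKDOWN)
headings = [section["heading"] for section in sections]
print(headings)
```

Each resulting section can then be indexed as its own record, so a search hit points to the relevant part of the document.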

Per-Page Structured Format

Returns a JSON object whose `pages` array holds one entry per page, each with the page number and the fields extracted from that page.

Use case: Extract data from multi-section documents where each page has different content

Example output:

{
  "pages": [
    {
      "page_number": 1,
      "data": {
        "patient_name": "John Doe",
        "date_of_birth": "1980-05-15"
      }
    },
    {
      "page_number": 2,
      "data": {
        "medications": ["Aspirin", "Lisinopril"]
      }
    }
  ]
}

Best for:

Multi-page medical records
Forms with page-specific sections
Documents with varying layouts per page
Page-by-page processing pipelines
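A page-by-page pipeline can walk the `pages` array shown in the example output above. The following Python sketch indexes the fields by page number and, as one possible strategy, merges them into a single record. Both helper functions are hypothetical, and keeping the first value seen for a repeated field is an assumption made for this example.

```python
import json

# Sample payload copied from the per-page example output above.
RESULT_JSON = """
{
  "pages": [
    {
      "page_number": 1,
      "data": {"patient_name": "John Doe", "date_of_birth": "1980-05-15"}
    },
    {
      "page_number": 2,
      "data": {"medications": ["Aspirin", "Lisinopril"]}
    }
  ]
}
"""


def fields_by_page(result):
    """Map each page number to the fields extracted from that page."""
    return {page["page_number"]: page["data"] for page in result["pages"]}


def merge_pages(result):
    """Combine all pages into one record, keeping the first value per field."""
    merged = {}
    for page in sorted(result["pages"], key=lambda p: p["page_number"]):
        for key, value in page["data"].items():
            merged.setdefault(key, value)
    return merged


result = json.loads(RESULT_JSON)
by_page = fields_by_page(result)
merged = merge_pages(result)
print(sorted(by_page))
print(merged["medications"])
```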

Comparison Table

| Format | Output Type | Best For | Data Granularity |
| --- | --- | --- | --- |
| `structured` | Single object | Complete document data | Document-level |
| `markdown` | Text per page | Readable text conversion | Page-level text |
| `per_page_structured` | Array of objects | Multi-section documents | Page-level data |

Choosing the Right Format

Need specific data extraction? → Use `structured`
Converting to text? → Use `markdown`
Page-specific processing? → Use `per_page_structured`
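The decision rules above can be written as a small helper. This is only a sketch: the `choose_format` function and its parameters are hypothetical, and the returned strings are the format names from the comparison table.

```python
def choose_format(needs_fields: bool, page_specific: bool = False) -> str:
    """Pick an output format name from two yes/no questions."""
    if not needs_fields:
        # Plain conversion to readable text.
        return "markdown"
    if page_specific:
        # Fields are needed and they differ from page to page.
        return "per_page_structured"
    # Fields are needed once for the whole document.
    return "structured"


print(choose_format(needs_fields=True))
print(choose_format(needs_fields=True, page_specific=True))
print(choose_format(needs_fields=False))
```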
