Financial Documents OCR
Automatically extract data from unstructured financial documents
Overview
The Financial Document API is ideal for applications where you might receive both invoices and receipts and want a single integration point. The core functionality of this API involves a two-step process:
- Document Classification: Upon receiving a document, the API first analyzes it to determine whether it is an invoice or a receipt.
- Intelligent Routing: Based on the classification, the document is then automatically routed to the appropriate underlying API for detailed information extraction.
This approach simplifies your integration by eliminating the need to pre-classify documents on your end and choose which specific API to call.
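Because classification happens on the API side, client code only needs to branch on the result after extraction. The following is a minimal sketch of that pattern, not an official client: it assumes the response has already been parsed into a flat Python dict keyed by the field names documented below, and the handler names and `dispatch` helper are hypothetical.

```python
from typing import Any, Callable, Dict

# Assumption: a parsed prediction is a flat dict keyed by documented field names.
Prediction = Dict[str, Any]


def handle_invoice(prediction: Prediction) -> str:
    # Invoices carry an invoice number and a real payment deadline.
    return f"invoice {prediction.get('invoice_number')} due {prediction.get('due_date')}"


def handle_receipt(prediction: Prediction) -> str:
    # Receipts carry a receipt number and a transaction date.
    return f"receipt {prediction.get('receipt_number')} paid on {prediction.get('date')}"


# Hypothetical lookup table keyed by the document_type values named in this page.
HANDLERS: Dict[str, Callable[[Prediction], str]] = {
    "INVOICE": handle_invoice,
    "RECEIPT": handle_receipt,
}


def dispatch(prediction: Prediction) -> str:
    """Route a parsed prediction to the handler matching its document_type."""
    doc_type = prediction.get("document_type")
    handler = HANDLERS.get(doc_type)
    if handler is None:
        raise ValueError(f"unexpected document_type: {doc_type!r}")
    return handler(prediction)
```

A caller would pass every parsed response to `dispatch` without knowing in advance which kind of document was uploaded.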
Underlying APIs
The Financial Document API leverages the capabilities of our dedicated Invoice and Receipt APIs. Once a document is classified, it is processed by one of these APIs, each of which has its own documentation.
Understanding the specific functionalities and responses of these underlying APIs can be helpful when working with the Financial Document API.
Financial Documents Data Fields Summary
Field Name | Description |
---|---|
billing_address | The address used for billing the customer. This is typically found on invoices. |
category | The main classification of the transaction or document (e.g., food or parking for receipts, miscellaneous for invoices). |
customer_address | The physical address of the customer. This is more commonly found on invoices. |
customer_company_registrations | An array containing registration details of the customer, such as VAT number or other tax identifiers. This is more common on invoices. |
customer_id | A unique identifier assigned to the customer by the supplier. This is more common on invoices. |
customer_name | The name of the customer. This is more commonly found on invoices but might appear on some receipts (e.g., for loyalty programs). |
date | The date the document was issued (invoice) or the transaction occurred (receipt). |
document_number | A general identifier for the document, which could be the invoice number or the receipt number. |
document_type | The broad type of the document (e.g., "INVOICE", "RECEIPT"). |
document_type_extended | A more specific classification of the document type (e.g., "INVOICE", "CREDIT CARD RECEIPT", "PAYSLIP" ...). |
due_date | The date by which payment is expected for an invoice. For receipts, this field will contain the transaction date. |
invoice_number | The unique identifier assigned to the invoice by the supplier. This field will typically be null for receipts. |
line_items | An array containing details of each item or service listed on the document, including description, quantity, unit price, and total amount. |
locale | Information about the language, country, and currency of the document (e.g., "en-US" for English, United States, USD). |
orientation | The detected orientation of the document. |
payment_date | The date when the payment was made. This might be present on both invoices and receipts, depending on the context. |
po_number | The purchase order number associated with the invoice, if available. This field will likely be null for receipts. |
receipt_number | The unique identifier printed on the receipt. This field will typically be null for invoices. |
reference_numbers | An array containing any additional reference numbers present on the document. |
shipping_address | The address where the purchased goods are to be shipped. This is typically found on invoices. |
subcategory | A more granular classification of the transaction or document (e.g. taxi under transport for receipts). |
supplier_address | The physical address of the supplier or merchant. |
supplier_company_registrations | An array containing registration details of the supplier, such as VAT number or other tax identifiers. |
supplier_email | The email address of the supplier or merchant. |
supplier_name | The name of the supplier or merchant. |
supplier_payment_details | An array containing details about how the supplier should be paid, such as bank account information. This is typically found on invoices. |
supplier_phone_number | The phone number of the supplier or merchant. |
supplier_website | The website address of the supplier or merchant. |
taxes | An array containing details of each tax applied to the document, including the tax code, rate, and amount. |
time | The time of the transaction, typically found on receipts. This field might be absent or null for invoices. |
tip | The amount of tip or gratuity paid, typically found on receipts. This field will likely be absent or null for invoices. |
total_amount | The final amount payable or paid, including all taxes and tips. |
total_net | The total amount before the application of taxes. |
total_tax | The total amount of tax applied to the document. |
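Several fields in the table are typically null for one of the two document types, so consuming code should read them defensively. The sketch below illustrates one way to do that, again assuming the response has been parsed into a flat dict keyed by field name; the helper names are hypothetical and not part of the API.

```python
from typing import Any, Dict, Optional

# Assumption: a parsed prediction is a flat dict keyed by documented field names.
Prediction = Dict[str, Any]


def document_identifier(prediction: Prediction) -> Optional[str]:
    """Return the most specific identifier present, or None if all are missing."""
    # invoice_number is typically null on receipts and receipt_number on invoices,
    # so fall back to the general document_number last.
    for key in ("invoice_number", "receipt_number", "document_number"):
        value = prediction.get(key)
        if value:
            return str(value)
    return None


def tip_amount(prediction: Prediction) -> float:
    """Return the tip as a float, treating an absent or null tip as zero."""
    tip = prediction.get("tip")
    return float(tip) if tip is not None else 0.0
```

Treating a missing tip as zero is a design choice for this example; an application that needs to distinguish "no tip" from "tip not detected" should keep the null.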
Field-by-Field Explanation
This section provides details about specific fields within the Financial Document API response and highlights how their behavior might differ based on whether the processed document was classified as an invoice or a receipt.
Category & Subcategory
The category field in the API response classifies the transaction or document. Its interpretation depends on the document type:
- Invoice: For documents classified as invoices, the category field always returns the value miscellaneous. This is because our invoice processing focuses on extracting key financial details rather than granular categorization.
- Receipt: For documents classified as receipts, the category field reflects the output of our dedicated receipt categorization model. This model analyzes the receipt content to determine the specific category (e.g., toll, food, parking, transport, accommodation, gasoline, telecom, miscellaneous).
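Since invoices always report miscellaneous, a category is only informative when the document was classified as a receipt. The following sketch shows one way to reflect that in client code. It assumes a flat dict keyed by field name, and `expense_label` is a hypothetical helper, not part of the API.

```python
from typing import Any, Dict, Optional


def expense_label(prediction: Dict[str, Any]) -> Optional[str]:
    """Return "category/subcategory" for receipts, or None when no real category exists."""
    # For invoices the category is a fixed placeholder, so report nothing.
    if prediction.get("document_type") != "RECEIPT":
        return None
    category = prediction.get("category")
    if not category:
        return None
    subcategory = prediction.get("subcategory")
    return f"{category}/{subcategory}" if subcategory else category
```

Returning None for invoices keeps the placeholder value out of downstream expense reports.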
Due Date
The due_date field represents the expected payment date for a financial document. However, its source and meaning differ between invoices and receipts:
- Invoice: For invoices, the due_date field contains the due date explicitly stated on the invoice, which gives you the payment deadline.
- Receipt: Receipts generally do not have a formal due date. When a receipt is processed, the date of the transaction (the date printed on the receipt) is copied into the due_date field to keep the API response structure consistent. For receipts, due_date therefore represents the transaction date, not a payment deadline.
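Because due_date holds a transaction date on receipts, code that schedules payments should check the document type before treating it as a deadline. A minimal sketch follows. It assumes a flat dict keyed by field name and that dates arrive as ISO 8601 strings (YYYY-MM-DD), which this page does not confirm; `payment_deadline` is a hypothetical helper.

```python
from datetime import date
from typing import Any, Dict, Optional


def payment_deadline(prediction: Dict[str, Any]) -> Optional[date]:
    """Return the due date for invoices only; receipts have no payment deadline."""
    # On receipts, due_date is a copy of the transaction date, so ignore it here.
    if prediction.get("document_type") != "INVOICE":
        return None
    raw = prediction.get("due_date")
    # Assumption: the date is an ISO 8601 string such as "2024-03-31".
    return date.fromisoformat(raw) if raw else None
```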