Invoice OCR Node.js
The Node.js OCR SDK supports the invoice API for extracting data from invoices.
Using the sample invoice below, this page shows how to extract the data you need with the OCR SDK.
Quick Start
const mindee = require("mindee");
// for TS or modules:
// import * as mindee from "mindee";
// Init a new client
const mindeeClient = new mindee.Client({ apiKey: "my-api-key" });
// Load a file from disk and parse it
const doc = mindeeClient.docFromPath("/path/to/the/file.ext");
const respPromise = doc.parse(mindee.InvoiceV4);
// Print a summary of the parsed data
respPromise.then((resp) => {
  if (resp.document === undefined) return;
  console.log(resp.document.toString());
});
Output:
----- Invoice V4 -----
Locale: fr; fr; EUR;
Invoice number: 0042004801351
Reference numbers: AD29094
Invoice date: 2020-02-17
Invoice due date: 2020-02-17
Supplier name: TURNPIKE DESIGNS CO.
Supplier address: 156 University Ave, Toronto ON, Canada M5H 2H7
Supplier company registrations: 501124705; FR33501124705
Supplier payment details: FR7640254025476501124705368;
Customer name: JIRO DOI
Customer company registrations: FR00000000000; 111222333
Customer address: 1954 Bloon Street West Toronto, ON, M6P 3K9 Canada
Line Items:
Code | QTY | Price | Amount | Tax (Rate) | Description
| | | 4.31 | (2.1 %) | PQ20 ETIQ ULTRA RESIS METAXXDC
| 1.0 | 65.0 | 75.0 | 10.0 | Platinum web hosting package Dow...
XXX81125600010 | 1.0 | 250.01 | 275.51 | 25.5 (10.2 %) | a long string describing the ite...
ABC456 | 200.3 | 8.101 | 1622.63 | 121.7 (7.5 %) | Liquid perfection
| | | | | CARTOUCHE L NR BROTHER TN247BK
Taxes: 97.98 20.0%
Total taxes: 97.98
Total amount excluding taxes: 489.97
Total amount including taxes: 587.95
----------------------
Field Objects
Each Field object contains at a minimum the following attributes:
- value (string or number depending on the field type): Corresponds to the field value. Can be null if no value was extracted.
- confidence (Float): The confidence score of the field prediction.
- bbox (Array< Array< Float > >): Contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
- polygon (Array< Array< Float > >): Contains the relative vertex coordinates (points) of a polygon containing the field in the image.
- reconstructed (Boolean): True if the field was reconstructed or computed using other fields.
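Because every field carries both a value and a confidence score, a common pattern is to accept a value only above some threshold. The following is a minimal sketch of that idea, not part of the SDK: the helper name `trustedValue`, the default threshold of 0.8, and the stand-in field objects are all assumptions made for illustration.

```javascript
// Hypothetical helper (not an SDK function): return a field's value only when
// it was extracted and its confidence meets a caller-chosen threshold.
function trustedValue(field, minConfidence = 0.8) {
  if (!field || field.value === null || field.value === undefined) {
    return null;
  }
  return field.confidence >= minConfidence ? field.value : null;
}

// Stand-in objects shaped like the Field attributes listed above.
// The confidence scores are made up for this example.
const sampleFields = {
  confident: { value: "0042004801351", confidence: 0.99, reconstructed: false },
  uncertain: { value: "AD29094", confidence: 0.41, reconstructed: false },
  missing: { value: null, confidence: 0.0, reconstructed: false },
};

console.log(trustedValue(sampleFields.confident)); // value is kept
console.log(trustedValue(sampleFields.uncertain)); // null: below the threshold
console.log(trustedValue(sampleFields.missing)); // null: nothing was extracted
```

The right threshold depends on how costly a wrong value is for your workflow, so treat 0.8 as a placeholder.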
Extracted Fields
Attributes that will be extracted from the document and available in the Invoice object:
- Orientation
- Customer Information
- Dates
- Locale and Currency
- Reference numbers
- Supplier Information
- Line items
- Taxes
- Total Amounts
Orientation
orientation (Orientation): The orientation field is only available at the page level, as it describes whether the page image should be rotated to be upright.
The orientation field gives a prediction among these 3 possible outputs:
- 0 degrees: the page is already upright
- 90 degrees: the page must be rotated clockwise to be upright
- 270 degrees: the page must be rotated counterclockwise to be upright
// Show the orientation of each page
resp.pages.forEach((page) => {
  console.log(page.orientation.value);
});
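The three orientation outputs can be mapped to the correction they imply. This is a small sketch under the assumption that the orientation value is exposed as a number of degrees (0, 90 or 270); the helper name `describeRotation` and the stand-in page objects are hypothetical.

```javascript
// Hypothetical helper: translate an orientation value into the correction
// described in the list above. Assumes the value is a number of degrees.
function describeRotation(orientationValue) {
  switch (orientationValue) {
    case 0:
      return "already upright";
    case 90:
      return "rotate clockwise";
    case 270:
      return "rotate counterclockwise";
    default:
      return "unknown orientation";
  }
}

// Stand-in pages shaped like the page objects used in the snippet above.
const samplePages = [
  { orientation: { value: 0 } },
  { orientation: { value: 90 } },
  { orientation: { value: 270 } },
];

samplePages.forEach((page, index) => {
  console.log(`page ${index}: ${describeRotation(page.orientation.value)}`);
});
```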
Customer Information
customerName (Field): Customer's name
const customerName = resp.document.customerName.value;
customerAddress (Field): Customer's postal address
const customerAddress = resp.document.customerAddress.value;
customerCompanyRegistrations (Array< CompanyRegistration >): Customer's company registrations
// Print all found customer registrations
resp.document.customerCompanyRegistrations.forEach((registration) => {
  console.log(
    registration.type,
    registration.value,
  );
});
Dates
Date fields:
- contain the dateObject attribute, which is a standard JavaScript Date object
- contain the raw attribute, which is the textual representation found on the document
- have a value attribute, which is the ISO 8601 representation of the date, regardless of the raw contents
The following date fields are available:
date: Date the invoice was issued.
// To get the invoice date of issuance (string)
const invoiceDate = resp.document.date.value;
dueDate: Payment due date of the invoice.
// To get the invoice due date (string)
const dueDate = resp.document.dueDate.value;
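Since date fields expose a standard Date object, date arithmetic needs no extra parsing. The sketch below computes the number of days between issue and due dates. The helper names `paymentTermDays` and `makeDateField` are hypothetical, the due date is made up for the example, and it is an assumption that dateObject is absent when no date was extracted.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: whole days between the issue date and the due date,
// or null when either date is missing.
function paymentTermDays(dateField, dueDateField) {
  if (!dateField?.dateObject || !dueDateField?.dateObject) {
    return null;
  }
  const diffMs =
    dueDateField.dateObject.getTime() - dateField.dateObject.getTime();
  return Math.round(diffMs / MS_PER_DAY);
}

// Stand-in for a parsed date field, built from an ISO 8601 date string.
function makeDateField(isoDate) {
  return { value: isoDate, raw: isoDate, dateObject: new Date(isoDate) };
}

const issued = makeDateField("2020-02-17"); // invoice date from the sample output
const due = makeDateField("2020-03-18"); // made-up due date for illustration

console.log(paymentTermDays(issued, due)); // 30
```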
Locale
locale (Locale): Locale information.
locale.language (String): Language code in ISO 639-1 format as seen on the document.
// To get the language code
const language = resp.document.locale.language;
locale.currency (String): Currency code in ISO 4217 format as seen on the document.
// To get the currency code
const currency = resp.document.locale.currency;
locale.country (String): Country code in ISO 3166-1 alpha-2 format as seen on the document.
// To get the country code
const country = resp.document.locale.country;
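The language and currency codes can be passed to the built-in Intl.NumberFormat to display amounts the way the invoice's locale would. This is a sketch only: the helper name `formatAmount`, the fallback to "en", and the stand-in locale object are assumptions, and the exact output string depends on the ICU data shipped with your Node.js build.

```javascript
// Hypothetical helper: format an amount using the detected language and currency.
function formatAmount(amount, locale) {
  if (amount === null || amount === undefined) {
    return "";
  }
  // Without a currency code, fall back to plain fixed-point output.
  if (!locale || !locale.currency) {
    return amount.toFixed(2);
  }
  return new Intl.NumberFormat(locale.language || "en", {
    style: "currency",
    currency: locale.currency,
  }).format(amount);
}

// Stand-in shaped like the locale attributes described above,
// using the values from the sample output.
const sampleLocale = { language: "fr", country: "fr", currency: "EUR" };

console.log(formatAmount(587.95, sampleLocale));
console.log(formatAmount(587.95, null));
```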
Reference numbers
referenceNumbers (Array< Field >): List of reference numbers, including the PO number.
Supplier Information
supplierCompanyRegistrations (Array< CompanyRegistration >): List of the supplier's detected company registration numbers. Each object in the list contains an extra attribute:
- type (String): Type of company registration number among: VAT NUMBER, SIRET, SIREN, NIF, CF, UID, STNR, HRA/HRB, TIN (includes EIN, FEIN, SSN, ATIN, PTIN, ITIN), RFC, BTW, ABN, UEN, CVR, ORGNR, INN, DPH, GSTIN, COMPANY REGISTRATION NUMBER (UK), KVK, DIC
- value (String): Value of the company identifier
// Show all found company registrations
resp.document.supplierCompanyRegistrations.forEach((registration) => {
  console.log(
    registration.type,
    registration.value,
  );
});
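When only one kind of identifier matters, such as the VAT number, the list can be searched by type. The sketch below assumes the type strings match the names listed above; the helper name `findRegistration` is hypothetical, and the pairing of types with the two sample values is a guess, since the sample output does not show the types.

```javascript
// Hypothetical helper: return the value of the first registration of the
// given type, or null when none is found.
function findRegistration(registrations, type) {
  const match = (registrations || []).find(
    (registration) => registration.type === type
  );
  return match ? match.value : null;
}

// Stand-in list shaped like CompanyRegistration objects.
// The type assigned to each value is an assumption.
const sampleRegistrations = [
  { type: "SIREN", value: "501124705" },
  { type: "VAT NUMBER", value: "FR33501124705" },
];

console.log(findRegistration(sampleRegistrations, "VAT NUMBER"));
console.log(findRegistration(sampleRegistrations, "SIRET")); // null: not present
```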
supplierName: Supplier name as written in the invoice (logo or supplier info).
const supplierName = resp.document.supplierName.value;
supplierAddress: Supplier address as written in the invoice.
const supplierAddress = resp.document.supplierAddress.value;
supplierPaymentDetails (Array< PaymentDetails >): List of the supplier's payment details found on the invoice. Each object in the list contains extra attributes:
- iban (String)
- swift (String)
- routingNumber (String)
- accountNumber (String)
// Show all found payment details
resp.document.supplierPaymentDetails.forEach((paymentDetail) => {
  console.log(
    paymentDetail.iban,
    paymentDetail.swift,
    paymentDetail.routingNumber,
    paymentDetail.accountNumber,
  );
});
Line items
lineItems (Array): Line item details. Each object in the list contains:
- productCode (string)
- description (string)
- quantity (number)
- unitPrice (number)
- totalAmount (number)
- taxRate (number)
- taxAmount (number)
- confidence (number)
- pageId (number)
- polygon (Polygon)
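Line items can be aggregated like any other array. The sketch below sums the totalAmount of each line while skipping lines where no amount was extracted, working in cents to avoid floating-point drift. The helper name `sumLineItems` is hypothetical, and the stand-in objects reuse only the product codes and amounts from the sample output, under the assumption that missing attributes are null.

```javascript
// Hypothetical helper: sum line totals in integer cents, ignoring lines
// without an extracted amount, then convert back to currency units.
function sumLineItems(lineItems) {
  const cents = lineItems.reduce((acc, item) => {
    if (item.totalAmount === null || item.totalAmount === undefined) {
      return acc;
    }
    return acc + Math.round(item.totalAmount * 100);
  }, 0);
  return cents / 100;
}

// Stand-in line items carrying the codes and amounts from the sample output.
const sampleLineItems = [
  { productCode: null, totalAmount: 4.31 },
  { productCode: null, totalAmount: 75.0 },
  { productCode: "XXX81125600010", totalAmount: 275.51 },
  { productCode: "ABC456", totalAmount: 1622.63 },
  { productCode: null, totalAmount: null }, // line with no extracted amount
];

console.log(sumLineItems(sampleLineItems)); // 1977.45
```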
Taxes
taxes (Array< TaxField >): Contains tax fields as seen on the invoice.
- value (Float): The tax amount.
- code (String): The tax code (HST, GST... for Canada; City Tax, State Tax for the US, etc.).
- rate (Float): The tax rate.
// Show all found taxes
resp.document.taxes.forEach((tax) => {
  console.log(
    tax.value,
    tax.rate,
    tax.code,
  );
});
Total Amounts
totalAmount (Field): Total amount including taxes.
const totalAmount = resp.document.totalAmount.value;
totalNet (Field): Total amount excluding taxes.
const totalNet = resp.document.totalNet.value;
totalTax (Field): Total tax value from tax lines.
const totalTax = resp.document.totalTax.value;
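The three totals should normally satisfy net + tax = total, which gives a cheap sanity check on an extraction. The following is a sketch under that assumption, not an SDK feature: the helper name `totalsAreConsistent` and the one-cent tolerance are illustrative choices.

```javascript
// Hypothetical helper: check that totalNet + totalTax matches totalAmount
// within a tolerance, comparing in integer cents to avoid rounding noise.
function totalsAreConsistent(totalNet, totalTax, totalAmount, toleranceCents = 1) {
  const values = [totalNet, totalTax, totalAmount];
  if (values.some((v) => v === null || v === undefined)) {
    return false; // cannot verify when any total is missing
  }
  const [net, tax, total] = values.map((v) => Math.round(v * 100));
  return Math.abs(net + tax - total) <= toleranceCents;
}

// Values taken from the sample output above.
console.log(totalsAreConsistent(489.97, 97.98, 587.95)); // true
// A deliberately wrong total for comparison.
console.log(totalsAreConsistent(489.97, 97.98, 600.0)); // false
```

With a real response, the arguments would be the value attributes of totalNet, totalTax and totalAmount.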