Invoice OCR Node.js

The Node.js OCR SDK supports the invoice API for extracting data from invoices.

The sample invoice below is used to illustrate how to extract data with the OCR SDK.
[Image: sample invoice]

Quick Start

import { Client, InvoiceResponse } from "mindee";

// Init a new client
const mindeeClient = new Client({ apiKey: "my-api-key" });

// Load a file from disk and parse it
const doc = mindeeClient.docFromPath("./a74eaa5-c8e283b-sample_invoice.jpeg");
const respPromise = doc.parse(InvoiceResponse);

// Print a summary of the parsed data
respPromise.then((resp) => {
  if (resp.document === undefined) return;

  console.log(resp.document.toString());
});

Output:

-----Invoice data-----
Filename: a74eaa5-c8e283b-sample_invoice.jpeg
Invoice number: 14
Total amount including taxes: 2608.2
Total amount excluding taxes: 2415.0
Invoice date: 2018-09-25
Invoice due date: 2018-09-25
Supplier name: TURNPIKE DESIGNS CO.
Supplier address: 156 University Ave, Toronto ON, Canada M5H 2H7
Customer name: JIRO DOI
Customer company registration: 
Customer address: 1954 Bloon Street West Toronto, ON, M6P 3K9 Canada
Payment details: 
Company numbers: 
Taxes: 193.2 8.0%
Total taxes: 193.2
Locale: en; en; CAD;
----------------------
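
The promise returned by parse can also be consumed with async/await, which makes it easier to handle a failed request. The following is only a sketch and not part of the SDK: summarize is a hypothetical helper, and the two promises at the end are stand-ins for the value returned by doc.parse(InvoiceResponse), so the snippet runs without an API key.

```javascript
// Hypothetical helper (not part of the SDK): waits for a parse promise and
// returns the document summary, or null if parsing failed or found nothing.
async function summarize(respPromise) {
  try {
    const resp = await respPromise;
    if (resp.document === undefined) return null;
    return resp.document.toString();
  } catch (err) {
    console.error("Parsing failed:", err.message);
    return null;
  }
}

// Stand-ins for the promise returned by doc.parse(InvoiceResponse).
const okPromise = Promise.resolve({
  document: { toString: () => "-----Invoice data-----" },
});
const failedPromise = Promise.reject(new Error("request failed"));

summarize(okPromise).then((summary) => console.log(summary));
summarize(failedPromise).then((summary) => console.log(summary));
```

Returning null in both the "no document" and the "request failed" cases keeps the caller simple; a real application may prefer to rethrow the error instead.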

Field Objects

Each Field object contains at a minimum the following attributes:

  • value (string or number depending on the field type):
    Corresponds to the field value. Can be null if no value was extracted.
  • confidence (Float):
    The confidence score of the field prediction.
  • bbox (Array< Array< Float > >):
    Contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
  • polygon (Array< Array< Float > >):
    Contains the relative vertex coordinates (points) of a polygon containing the field in the image.
  • reconstructed (Boolean):
    True if the field was reconstructed or computed using other fields.
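
As an illustration of how these attributes might be used together, the sketch below filters a field by confidence and converts its bounding box to pixel coordinates. Everything in it is an assumption made for the example: isReliable and toPixels are hypothetical helpers, the sample field is hand-written rather than taken from a real response, and each point is assumed to be an [x, y] pair expressed as a fraction of the page width and height.

```javascript
// Hand-written object shaped like the attributes listed above; the numbers
// are made up for illustration and do not come from a real API response.
const sampleField = {
  value: "14",
  confidence: 0.99,
  bbox: [
    [0.25, 0.1],
    [0.5, 0.1],
    [0.5, 0.2],
    [0.25, 0.2],
  ],
  reconstructed: false,
};

// Hypothetical helper: accept a field only if it has a value and its
// confidence reaches the caller-chosen threshold.
function isReliable(field, minConfidence) {
  return field.value !== null && field.confidence >= minConfidence;
}

// Hypothetical helper: scale relative [x, y] points to pixel coordinates
// for an image of the given size (assumes coordinates in the 0..1 range).
function toPixels(points, width, height) {
  return points.map(([x, y]) => [Math.round(x * width), Math.round(y * height)]);
}

console.log(isReliable(sampleField, 0.8));
console.log(toPixels(sampleField.bbox, 1000, 2000));
```

The 0.8 threshold is arbitrary; a suitable value depends on how costly a wrong extraction is for the application.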

Extracted Fields

Attributes that will be extracted from the document and available in the Invoice object:

Orientation

  • orientation (Orientation): The orientation field is only available at the page level as it describes whether the page image should be rotated to be upright.

If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:

  • 0 degrees: the page is already upright
  • 90 degrees: the page must be rotated clockwise to be upright
  • 270 degrees: the page must be rotated counterclockwise to be upright
// Show the orientation of each page
resp.pages.forEach((page) => {
  console.log(page.orientation.value);
});
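
The three possible outputs can be mapped to a readable hint. This is a minimal sketch: rotationHint is a hypothetical helper, and the pages array is a stand-in for resp.pages.

```javascript
// Hypothetical helper: describe the rotation suggested by an orientation
// value, following the three documented outputs.
function rotationHint(degrees) {
  switch (degrees) {
    case 0:
      return "already upright";
    case 90:
      return "rotate clockwise";
    case 270:
      return "rotate counterclockwise";
    default:
      return "unknown orientation";
  }
}

// Stand-in for resp.pages, built by hand for illustration.
const pages = [{ orientation: { value: 0 } }, { orientation: { value: 90 } }];
pages.forEach((page, index) => {
  console.log(`Page ${index}: ${rotationHint(page.orientation.value)}`);
});
```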

Customer Information

  • customerName (Field): Customer's name
// To get the customer name
const customerName = resp.document.customerName.value
  • customerAddress (Field): Customer's postal address
// To get the customer address (String)
const customerAddress = resp.document.customerAddress.value
  • customerCompanyRegistration (Array< CompanyRegistration >): Customer's company registration
// Print all found customer registrations
resp.document.customerCompanyRegistration.forEach((registration) => {
  console.log(
    registration.type,
    registration.value,
  );
});

Dates

Date fields:

  • contain the dateObject attribute, which is a standard JavaScript Date object
  • contain the raw attribute, which is the textual representation found on the document.
  • have a value attribute which is the ISO 8601 representation of the date, regardless of the raw contents.

The following date fields are available:

  • date: Date the invoice was issued
// To get the invoice date of issuance (string)
const invoiceDate = resp.document.date.value;
  • dueDate: Payment due date of the invoice.
// To get the invoice due date (string)
const dueDate = resp.document.dueDate.value
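
Because value is an ISO 8601 string, the two dates can be compared directly. The sketch below computes the number of days between issuance and due date. It assumes value is a date-only string such as 2018-09-25 (as in the sample output); daysBetween is a hypothetical helper, and the two objects are stand-ins for resp.document.date and resp.document.dueDate.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper (not part of the SDK): number of days between two
// date fields, or null if either date was not extracted.
function daysBetween(startField, endField) {
  if (startField.value === null || endField.value === null) return null;
  // Date-only ISO 8601 strings are parsed as UTC midnight, so the
  // difference is a whole number of days.
  const start = Date.parse(startField.value);
  const end = Date.parse(endField.value);
  return Math.round((end - start) / MS_PER_DAY);
}

// Stand-ins for resp.document.date and resp.document.dueDate; the due date
// is made up for illustration.
const issued = { value: "2018-09-25" };
const due = { value: "2018-10-25" };
console.log(daysBetween(issued, due)); // 30
```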

Locale

locale (Locale): Locale information.

  • locale.language (String): Language code in ISO 639-1 format as seen on the document.
    The following language codes are supported: ca, de, en, es, fr, it, nl and pt.
// To get the language code
const language = resp.document.locale.language
  • locale.currency (String): Currency code in ISO 4217 format as seen on the document.
    The following currency codes are supported: CAD, CHF, GBP, EUR, USD.
// To get the currency code
const currency = resp.document.locale.currency
  • locale.country (String): Country code in ISO 3166-1 alpha-2 format as seen on the document.
    The following country codes are supported: CA, CH, DE, ES, FR, GB, IT, NL, PT and US.
// To get the country code
const country = resp.document.locale.country
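
The locale codes can be passed to the standard Intl.NumberFormat API to display amounts the way a reader of the invoice would expect. This is a sketch under assumptions: formatAmount is a hypothetical helper, and sampleLocale is a hand-written stand-in for resp.document.locale.

```javascript
// Hypothetical helper: format an amount using the extracted language,
// country and currency codes. Missing codes fall back to the runtime's
// default locale or to plain number formatting.
function formatAmount(amount, locale) {
  let tag;
  if (locale.language) {
    tag = locale.country ? `${locale.language}-${locale.country}` : locale.language;
  }
  if (!locale.currency) {
    return new Intl.NumberFormat(tag).format(amount);
  }
  return new Intl.NumberFormat(tag, {
    style: "currency",
    currency: locale.currency,
  }).format(amount);
}

// Stand-in for resp.document.locale, built by hand for illustration.
const sampleLocale = { language: "en", country: "CA", currency: "CAD" };
console.log(formatAmount(2608.2, sampleLocale));
```

The exact currency symbol printed depends on the locale data shipped with the JavaScript runtime.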

Payment Information

paymentDetails (Array< PaymentDetails >): List of invoice's supplier payment details. Each object in the list contains extra attributes:

  • iban (String)
  • swift (String)
  • routingNumber (String)
  • accountNumber (String)
// Show all found payment details
resp.document.paymentDetails.forEach((paymentDetail) => {
  console.log(
    paymentDetail.iban,
    paymentDetail.swift,
    paymentDetail.routingNumber,
    paymentDetail.accountNumber,
  );
});

Supplier Information

companyRegistration (Array< CompanyRegistration >): List of detected supplier's company registration numbers. Each object in the list contains an extra attribute:

  • type (String): Type of company registration number.
// Show all found company registrations
resp.document.companyRegistration.forEach((registration) => {
  console.log(
    registration.type,
    registration.value
  );
});
  • supplier: Supplier name as written in the invoice (logo or supplier Info).
// To get the supplier name
const supplierName = resp.document.supplier.value
  • supplierAddress: Supplier address as written in the invoice.
// To get the supplier address
const supplierAddress = resp.document.supplierAddress.value

Taxes

taxes (Array< TaxField >): Contains tax fields as seen on the invoice.

  • value (Float): The tax amount.
  • code (String): The tax code (HST, GST, etc. for Canada; City Tax, State Tax, etc. for the US).
  • rate (Float): The tax rate.
resp.document.taxes.forEach((tax) => {
  console.log(
    tax.value,
    tax.rate,
    tax.code,
  );
});
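
When an invoice lists several tax lines, it can be useful to add them up. The sketch below assumes a tax line's value may be null when nothing was extracted; sumTaxes is a hypothetical helper, and the taxes array is a stand-in for resp.document.taxes using the amount from the sample output.

```javascript
// Hypothetical helper: add up tax lines, skipping lines without a value.
function sumTaxes(taxLines) {
  return taxLines.reduce((sum, tax) => sum + (tax.value ?? 0), 0);
}

// Stand-in for resp.document.taxes; the null code is an assumption, since
// the sample output shows no tax code.
const taxes = [{ value: 193.2, rate: 8.0, code: null }];
console.log(sumTaxes(taxes));
```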

Total Amounts

  • totalIncl (Field): Total amount including taxes.
// To get the total amount including taxes value (float), ex: 14.24
const totalIncl = resp.document.totalIncl.value
  • totalExcl (Field): Total amount excluding taxes.
// To get the total amount excluding taxes value (float), ex: 10.21
const totalExcl = resp.document.totalExcl.value;
  • totalTax (Field): Total tax value from tax lines.
// To get the total tax amount value (float), ex: 8.42
const totalTax = resp.document.totalTax.value;
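
A simple sanity check on the extracted amounts is to verify that the total excluding taxes plus the total tax matches the total including taxes. The sketch below is an illustration only: totalsAreConsistent is a hypothetical helper, the 0.01 tolerance is an arbitrary choice to absorb floating-point rounding, and the numbers are those of the sample output.

```javascript
// Hypothetical helper: true when totalExcl + totalTax equals totalIncl
// within the given tolerance. Returns false if any amount is missing.
function totalsAreConsistent(totalExcl, totalTax, totalIncl, tolerance = 0.01) {
  if (totalExcl === null || totalTax === null || totalIncl === null) {
    return false;
  }
  return Math.abs(totalExcl + totalTax - totalIncl) <= tolerance;
}

// Values from the sample output: 2415.0 + 193.2 = 2608.2
console.log(totalsAreConsistent(2415.0, 193.2, 2608.2)); // true
```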

 

Questions?
Join our Slack

