Receipt OCR Node.js
The Node.js OCR SDK supports the receipt API for extracting data from receipts.
Using the sample receipt below, we illustrate how to extract the data we want with the OCR SDK.
Quick Start
const mindee = require("mindee");
// for TS or modules:
// import * as mindee from "mindee";
// Init a new client
const mindeeClient = new mindee.Client({ apiKey: "my-api-key" });
// Load a file from disk and parse it
const doc = mindeeClient.docFromPath("/path/to/the/file.ext");
const respPromise = doc.parse(mindee.ReceiptV4);
// Print a summary of the parsed data
respPromise.then((resp) => {
  if (resp.document === undefined) return;
  console.log(resp.document.toString());
});
Output:
----- Receipt V4 -----
Filename: ffc127d-sample_receipt.jpg
Total amount: 10.2
Total net: 8.5
Tip:
Date: 2016-02-26
Category: food
Subcategory: restaurant
Document type: EXPENSE RECEIPT
Time: 15:20
Supplier name: CLACHAN
Taxes: 1.7 20.0% VAT
Total taxes: 1.7
Locale: en-GB; en; GB; GBP;
----------------------
Field Objects
Each Field object contains at a minimum the following attributes:
- value (string or number depending on the field type): Corresponds to the field value. Can be null if no value was extracted.
- confidence (Float): The confidence score of the field prediction.
- bbox (Array< Array< Float > >): Contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
- polygon (Array< Array< Float > >): Contains the relative vertex coordinates (points) of a polygon containing the field in the image.
- reconstructed (Boolean): True if the field was reconstructed or computed using other fields.
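As a rough illustration of how these attributes might be used together, the sketch below keeps a field value only when its confidence clears a threshold and converts relative polygon coordinates to pixels. The mock `field` object, the helper names `valueIfConfident` and `toPixelCoords`, the 0.8 threshold, and the assumption that each point is an [x, y] pair expressed as a fraction of the image width and height are all illustrative and not taken from the SDK.

```javascript
// Hypothetical field object shaped like the attribute list above (not a real SDK response).
const field = {
  value: "CLACHAN",
  confidence: 0.95,
  polygon: [[0.1, 0.2], [0.5, 0.2], [0.5, 0.3], [0.1, 0.3]],
  reconstructed: false,
};

// Return the field value, or null when it is missing or below the threshold.
function valueIfConfident(f, minConfidence = 0.8) {
  if (f.value === null || f.value === undefined) return null;
  return f.confidence >= minConfidence ? f.value : null;
}

// Assumption: each point is [x, y], expressed as a fraction of image width and height.
function toPixelCoords(points, imageWidth, imageHeight) {
  return points.map(([x, y]) => [
    Math.round(x * imageWidth),
    Math.round(y * imageHeight),
  ]);
}

console.log(valueIfConfident(field));
console.log(toPixelCoords(field.polygon, 1000, 2000));
```

The threshold is a judgment call: a stricter value discards more uncertain predictions, and more fields will then need manual review.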
Extracted Fields
Attributes that will be extracted from the document and available in the Receipt object:
Orientation
orientation (Orientation): The orientation field is only available at the page level, as it describes whether the page image should be rotated to be upright.
If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:
- 0 degrees: the page is already upright
- 90 degrees: the page must be rotated clockwise to be upright
- 270 degrees: the page must be rotated counterclockwise to be upright
// Show the orientation of each page
resp.pages.forEach((page) => {
  console.log(page.orientation.value);
});
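The three possible outputs can be turned into an instruction for an image-processing step. This is only a sketch: `describeRotation` is a hypothetical helper, and `mockPages` stands in for `resp.pages` so that the snippet runs without an API call.

```javascript
// Map an orientation prediction (0, 90 or 270) to the action needed to make the page upright.
function describeRotation(orientationValue) {
  switch (orientationValue) {
    case 0:
      return "already upright";
    case 90:
      return "rotate clockwise";
    case 270:
      return "rotate counterclockwise";
    default:
      // Anything else (including a missing value) is not one of the documented outputs.
      return "unknown orientation";
  }
}

// Hypothetical stand-in for resp.pages.
const mockPages = [{ orientation: { value: 0 } }, { orientation: { value: 90 } }];

mockPages.forEach((page, index) => {
  console.log(`page ${index}: ${describeRotation(page.orientation.value)}`);
});
```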
Category
category (Field): Receipt category as seen on the receipt.
List of supported categories: https://developers.mindee.com/docs/receipt-ocr#category.
const category = resp.document.category.value
SubCategory
subcategory (Field): More precise subcategory.
List of supported subcategories: https://developers.mindee.com/docs/receipt-ocr#subcategory.
const subcategory = resp.document.subcategory.value
DocumentType
documentType (Field): A classification field for the receipt.
const documentType = resp.document.documentType.value
List of supported document types: https://developers.mindee.com/docs/receipt-ocr#document-type
Date
Date fields:
- contain the dateObject attribute, which is a standard JavaScript Date object
- contain the raw attribute, which is the textual representation found on the document
- have a value attribute, which is the ISO 8601 representation of the date, regardless of the raw contents
The following date fields are available:
date: Date the receipt was issued
const receiptDate = resp.document.date.value
Locale
locale (Locale): Locale information.
locale.language (String): Language code in ISO 639-1 format as seen on the document.
const language = resp.document.locale.language
locale.currency (String): Currency code in ISO 4217 format as seen on the document.
const currency = resp.document.locale.currency
locale.country (String): Country code in ISO 3166-1 alpha-2 format as seen on the document.
const country = resp.document.locale.country
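Because the locale codes follow ISO standards, they can be passed to the built-in `Intl.NumberFormat` to display an amount the way a reader of the receipt would expect. The sketch below uses hard-coded values that mirror the sample output. `toLocaleTag` and `formatAmount` are hypothetical helpers, not part of the SDK.

```javascript
// Hypothetical locale and amount values, mirroring the sample output above.
const sampleLocale = { language: "en", country: "GB", currency: "GBP" };
const sampleTotal = 10.2;

// Build a BCP 47 tag such as "en-GB". Falls back to the language alone,
// or to undefined (the runtime default locale) when nothing was extracted.
function toLocaleTag(loc) {
  if (!loc.language) return undefined;
  return loc.country ? `${loc.language}-${loc.country}` : loc.language;
}

// Format as currency when a currency code is available, else as a plain 2-decimal number.
function formatAmount(amount, loc) {
  const options = loc.currency
    ? { style: "currency", currency: loc.currency }
    : { minimumFractionDigits: 2, maximumFractionDigits: 2 };
  return new Intl.NumberFormat(toLocaleTag(loc), options).format(amount);
}

console.log(formatAmount(sampleTotal, sampleLocale));
```

The fallbacks matter because any of the three locale attributes may be missing when the model could not extract them.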
Supplier Information
supplier (Field): Supplier name as written on the receipt.
const supplier = resp.document.supplier.value
Tip
tip (Field): Total amount of tip and gratuity.
const tip = resp.document.tip.value
Taxes
taxes (Array< TaxField >): Contains tax fields as seen on the receipt.
- value (Float): The tax amount.
- code (String): The tax code (HST, GST... for Canada; City Tax, State Tax for the US, etc.).
- rate (Float): The tax rate.
- basis (Float): The tax base.
resp.document.taxes.forEach((tax) => {
  console.log(
    tax.value,
    tax.rate,
    tax.code,
    tax.basis,
  );
});
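One possible use of the tax lines is a sanity check that their sum agrees with the total tax field. The sketch below is illustrative only: `mockReceipt` is a hand-written object mirroring the sample output, and `sumTaxLines`, `taxesMatchTotal`, and the 0.01 tolerance are assumptions rather than SDK features.

```javascript
// Hypothetical parsed values mirroring the sample output (not a live API response).
const mockReceipt = {
  taxes: [{ value: 1.7, rate: 20.0, code: "VAT" }],
  totalTax: { value: 1.7 },
};

// Sum the tax lines, treating lines with no extracted value as 0.
function sumTaxLines(taxes) {
  return taxes.reduce((sum, tax) => sum + (tax.value ?? 0), 0);
}

// Compare with a small tolerance to absorb floating-point rounding.
function taxesMatchTotal(receipt, tolerance = 0.01) {
  if (receipt.totalTax.value == null) return false;
  return Math.abs(sumTaxLines(receipt.taxes) - receipt.totalTax.value) <= tolerance;
}

console.log(sumTaxLines(mockReceipt.taxes));
console.log(taxesMatchTotal(mockReceipt));
```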
Time
time: Time of purchase as seen on the receipt.
- value (string): Time of purchase in 24-hour format (hh:mm).
- raw (string): The time in whatever format appears on the receipt.
const time = resp.document.time.value
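Since the date value is ISO 8601 and the time value is hh:mm, the two can be combined into a single timestamp. The sketch below works on hard-coded strings taken from the sample output. `toTimestamp` is a hypothetical helper, and treating the result as UTC is an assumption, because the receipt itself carries no time zone.

```javascript
// Hypothetical values mirroring the sample output above.
const sampleDate = "2016-02-26"; // shape of date.value (ISO 8601)
const sampleTime = "15:20"; // shape of time.value (hh:mm)

// Combine both into a Date, or return null if either is missing or malformed.
// Assumption: no time zone is known, so the timestamp is interpreted as UTC.
function toTimestamp(dateValue, timeValue) {
  if (!dateValue || !timeValue) return null;
  if (!/^\d{4}-\d{2}-\d{2}$/.test(dateValue)) return null;
  if (!/^\d{2}:\d{2}$/.test(timeValue)) return null;
  const parsed = new Date(`${dateValue}T${timeValue}:00Z`);
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

console.log(toTimestamp(sampleDate, sampleTime).toISOString());
// prints "2016-02-26T15:20:00.000Z"
```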
Total Amounts
totalAmount (AmountField): Total amount including taxes and tips
const totalAmount = resp.document.totalAmount.value
totalNet (AmountField): Total amount paid excluding taxes and tip
const totalNet = resp.document.totalNet.value
totalTax (AmountField): Total tax value from tax lines
const totalTax = resp.document.totalTax.value
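Given the definitions above, the net total plus taxes plus tip should come close to the total amount, which suggests a simple consistency check. The sketch below uses values from the sample output. `totalsAreConsistent` and its 0.01 tolerance are illustrative assumptions, not SDK behavior.

```javascript
// Hypothetical extracted values mirroring the sample output (the tip was empty there).
const sampleTotals = { totalAmount: 10.2, totalNet: 8.5, totalTax: 1.7, tip: null };

// Returns true or false when the check can be made, or null when a required value is missing.
function totalsAreConsistent({ totalAmount, totalNet, totalTax, tip }, tolerance = 0.01) {
  if (totalAmount == null || totalNet == null || totalTax == null) return null;
  const expected = totalNet + totalTax + (tip ?? 0); // a missing tip counts as 0
  return Math.abs(expected - totalAmount) <= tolerance;
}

console.log(totalsAreConsistent(sampleTotals));
```

Returning null rather than false when a value is missing keeps "could not check" distinct from "checked and inconsistent".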