Invoices API
The Ruby client library supports the invoice API for extracting data from invoices.
require 'mindee'
# Init a new client and configure the Invoice API
mindee_client = Mindee::Client.new.config_invoice(api_key: 'my-api-key')
invoice_data = mindee_client.doc_from_path('/path/to/the/invoice.pdf').parse('invoice')
puts invoice_data.document
Using the sample invoice below, we illustrate how to extract the data we want with the client library.
Response Objects
The response object is common to all documents, including custom documents. The main properties are:
- document: Document level prediction
- pages: Page level prediction
- http_response: Raw HTTP response
Document Level Prediction
The document attribute is an Invoice object which contains the data extracted from the entire document, all pages included.
The same field can appear on several pages, but at the document level only the highest-confidence value is shown (this is done automatically at the API level).
For example, if you send a three-page invoice, the document level will provide you with one tax, one total, and so on.
# print the complete object
pp invoice_data.document
# print a summary of the document-level info
puts invoice_data.document
Output:
-----Invoice data-----
Invoice number: 0042004801351
Total amount including taxes: 587.95
Total amount excluding taxes: 489.97
Invoice date: 2020-02-17
Invoice due date: 2020-02-17
Supplier name: TURNPIKE DESIGNS CO.
Supplier address: 156 University Ave, Toronto ON, Canada M5H 2H7
Customer name: JIRO DOI
Customer company registration: FR00000000000; 111222333
Customer address: 1954 Bloon Street West Toronto, ON, M6P 3K9 Canada
Payment details: FR7640254025476501124705368;
Company numbers: 501124705; FR33501124705
Taxes: 97.98 20.0%
Total taxes: 97.98
Locale: fr; EUR;
--------------------
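The API performs the document-level selection itself, so no client code is needed for it. Purely as an illustration of the idea, here is a minimal sketch that keeps the highest-confidence candidate among per-page predictions. The PageField struct, the best_field helper, and the sample values are assumptions made for this example and are not part of the Mindee library.

```ruby
# Hypothetical stand-in for a field predicted on one page (not a Mindee class).
PageField = Struct.new(:value, :confidence)

# Return the candidate with the highest confidence, ignoring candidates
# that carry no value. Returns nil when nothing usable is left.
def best_field(candidates)
  candidates.reject { |field| field.value.nil? }.max_by(&:confidence)
end

# Made-up total amounts read on the three pages of an invoice.
totals_per_page = [
  PageField.new(587.95, 0.99),
  PageField.new(nil, 0.0),
  PageField.new(58.95, 0.42)
]

best_total = best_field(totals_per_page)
puts best_total.value
```

With the real client, the already-selected value is read directly from invoice_data.document.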
Page Level Prediction
The pages attribute is an array of Invoice objects, each holding the data extracted from one page of the document. All response objects have this property, regardless of the number of pages. Single-page documents have a single entry.
Iteration is done like any Ruby array:
invoice_data.pages.each do |page|
  # as object, complete
  pp page
  # as string, summary
  puts page
end
Raw HTTP Response
This contains the full Mindee API HTTP response. This can be useful for debugging.
# full HTTP response object
puts invoice_data.http_response
Extracted Fields
Each Invoice object contains a set of different fields. Each Field object contains at a minimum the following attributes:
- value (String or Float depending on the field type): corresponds to the field value. Can be nil if no value was extracted.
- confidence (Float): the confidence score of the field prediction.
- bbox (Array< Array< Float > >): contains exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
- polygon (Array< Array< Float > >): contains the relative vertex coordinates (points) of a polygon containing the field in the image.
- reconstructed (Boolean): true if the field was reconstructed or computed using other fields.
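Because value can be nil and every field carries a confidence score, calling code usually needs a guard before trusting a value. The following is a minimal sketch of such a guard. SketchField is a hypothetical stand-in for the library's Field object, and the 0.8 threshold is an arbitrary choice for this example.

```ruby
# Hypothetical stand-in exposing some of the attributes listed above
# (not the Mindee Field class).
SketchField = Struct.new(:value, :confidence, :reconstructed)

# Return the field value only when it exists and its confidence reaches
# the threshold. The 0.8 default is an assumption made for this sketch.
def trusted_value(field, min_confidence: 0.8)
  return nil if field.value.nil?
  return nil if field.confidence < min_confidence

  field.value
end

invoice_number = SketchField.new('0042004801351', 0.97, false)
missing_field = SketchField.new(nil, 0.0, false)

puts trusted_value(invoice_number).inspect
puts trusted_value(missing_field).inspect
```

The same guard could be applied to any field read from invoice_data.document or from a page.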
Additional Attributes
Depending on the field type, there might be additional attributes that will be extracted in the Invoice object. Below is the list of basic fields that can be extracted:
- Orientation
- Customer Information
- Dates
- Locale and Currency
- Payment Information
- Supplier Information
- Taxes
- Total Amounts
Orientation
orientation (Orientation): The orientation field is only available at the page level as it describes whether the page image should be rotated to be upright.
The orientation field gives a prediction among these 3 possible outputs:
- 0 degrees: the page is already upright
- 90 degrees: the page must be rotated clockwise to be upright
- 270 degrees: the page must be rotated counterclockwise to be upright
# To get the orientation of the 1st page
orientation = invoice_data.pages[0].orientation.degrees
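The three possible outputs can be turned into an explicit rotation instruction. This is a minimal sketch that follows the list above. The rotation_for helper and the symbols it returns are names invented for this example.

```ruby
# Rotation to apply for each orientation value, following the list above.
ROTATION_BY_DEGREES = {
  0 => :none,
  90 => :clockwise,
  270 => :counterclockwise
}.freeze

# Hypothetical helper: map an orientation value to the rotation to apply.
# Raises ArgumentError for any value outside the three documented outputs.
def rotation_for(degrees)
  ROTATION_BY_DEGREES.fetch(degrees) do
    raise ArgumentError, "unexpected orientation: #{degrees}"
  end
end

puts rotation_for(90)
```

With the real client, the argument would be the degrees value read from a page's orientation field.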
Customer Information
customer_name (Field): Customer's name
# To get the customer name
customer_name = invoice_data.document.customer_name.value
customer_address (Field): Customer's postal address
# To get the customer address (String)
customer_address = invoice_data.document.customer_address.value
customer_company_registration (Array): Customer's company registration numbers
# To get the customer company registrations
customer_company_registrations = invoice_data.document.customer_company_registration
Dates
Date fields:
- contain the date_object attribute, which is a standard Ruby date object
- contain the raw attribute, which is the textual representation found on the document
- have a value attribute, which is the ISO 8601 representation of the date, regardless of the raw contents
The following date fields are available:
date: Date the invoice was issued.
# To get the invoice date of issuance (string)
invoice_date = invoice_data.document.date.value
due_date: Payment due date of the invoice.
# To get the invoice due date (string)
due_date = invoice_data.document.due_date.value
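Since value is an ISO 8601 string, it can be parsed with Ruby's standard date library when date arithmetic is needed. The sketch below computes the number of days between the issue date and the due date. The days_until_due helper is a hypothetical name, and the sample strings mirror the output shown earlier.

```ruby
require 'date'

# Hypothetical helper: number of days between two ISO 8601 date strings.
# Returns nil when either value is missing (value can be nil).
def days_until_due(issued_value, due_value)
  return nil if issued_value.nil? || due_value.nil?

  (Date.iso8601(due_value) - Date.iso8601(issued_value)).to_i
end

# Strings as they would be returned by the value attribute.
puts days_until_due('2020-02-17', '2020-02-17')
```

When a Ruby date is all that is needed, the date_object attribute already provides one without any parsing.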
Locale
locale (Locale): Locale information.
locale.language (String): Language code in ISO 639-1 format as seen on the document.
The following language codes are supported: ca, de, en, es, fr, it, nl and pt.
# To get the language code
language = invoice_data.document.locale.language
locale.currency (String): Currency code in ISO 4217 format as seen on the document.
The following currency codes are supported: CAD, CHF, GBP, EUR and USD.
# To get the currency code
currency = invoice_data.document.locale.currency
locale.country (String): Country code in ISO 3166-1 alpha-2 format as seen on the document.
The following country codes are supported: CA, CH, DE, ES, FR, GB, IT, NL, PT and US.
# To get the country code
country = invoice_data.document.locale.country
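A locale attribute may be missing or fall outside the supported lists, so it can help to check the three codes before relying on them. The following is a minimal sketch using the lists given above. SketchLocale and missing_or_unsupported are hypothetical names, not part of the library.

```ruby
# Supported codes, copied from the lists above.
SUPPORTED_LANGUAGES = %w[ca de en es fr it nl pt].freeze
SUPPORTED_CURRENCIES = %w[CAD CHF GBP EUR USD].freeze
SUPPORTED_COUNTRIES = %w[CA CH DE ES FR GB IT NL PT US].freeze

# Hypothetical stand-in for the locale object (not a Mindee class).
SketchLocale = Struct.new(:language, :currency, :country)

# Return the names of the attributes that are nil or not in the supported lists.
def missing_or_unsupported(locale)
  issues = []
  issues << :language unless SUPPORTED_LANGUAGES.include?(locale.language)
  issues << :currency unless SUPPORTED_CURRENCIES.include?(locale.currency)
  issues << :country unless SUPPORTED_COUNTRIES.include?(locale.country)
  issues
end

# The sample output above shows a language and a currency but no country.
p missing_or_unsupported(SketchLocale.new('fr', 'EUR', nil))
```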
Payment Information
payment_details (Array< PaymentDetails >): List of the invoice supplier's payment details. Each object in the list contains extra attributes:
iban (String)
# Show the IBAN of the first payment
puts invoice_data.document.payment_details[0].iban
swift (String)
# Show the SWIFT of the first payment
puts invoice_data.document.payment_details[0].swift
routing_number (String)
# Show the routing number of the first payment
puts invoice_data.document.payment_details[0].routing_number
account_number (String)
# Show the account number of the first payment
puts invoice_data.document.payment_details[0].account_number
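Indexing with [0] fails with a NoMethodError when the list is empty, because [0] then returns nil. A more defensive way to read the list might look like the sketch below. SketchPayment and first_iban are hypothetical names, and the sketch assumes that an attribute which was not detected is nil.

```ruby
# Hypothetical stand-in for one payment details entry (not a Mindee class).
SketchPayment = Struct.new(:iban, :swift, :routing_number, :account_number)

# Return the first IBAN found in the list, or nil when the list is empty
# or no entry carries an IBAN.
def first_iban(payment_details)
  payment_details.map(&:iban).compact.first
end

# Made-up list using the IBAN from the sample output above.
payments = [SketchPayment.new('FR7640254025476501124705368', nil, nil, nil)]

puts first_iban(payments)
puts first_iban([]).inspect
```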
Supplier Information
company_registration (Array< CompanyRegistration >): List of the supplier's detected company registration numbers. Each object in the list contains an extra attribute:
type (String): Type of company registration number among: VAT NUMBER, SIRET, SIREN, NIF, CF, UID, STNR, HRA/HRB, TIN (includes EIN, FEIN, SSN, ATIN, PTIN, ITIN), RFC, BTW, ABN, UEN, CVR, ORGNR, INN, DPH, GSTIN, COMPANY REGISTRATION NUMBER (UK), KVK, DIC
# Show the type of the first registration
puts invoice_data.document.company_registration[0].type
value (String): Value of the company identifier
# Show the value of the first registration
puts invoice_data.document.company_registration[0].value
supplier: Supplier name as written in the invoice (logo or supplier info).
# To get the supplier name
supplier_name = invoice_data.document.supplier.value
supplier_address: Supplier address as written in the invoice.
# To get the supplier address
supplier_address = invoice_data.document.supplier_address.value
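When an invoice lists several registration numbers, the type attribute can be used to pick the one that is needed. The following is a minimal sketch. SketchRegistration and registration_of_type are hypothetical names, and the types assigned to the two sample numbers are assumptions made for illustration.

```ruby
# Hypothetical stand-in for one company registration entry (not a Mindee class).
SketchRegistration = Struct.new(:type, :value)

# Return the value of the first registration of the given type, or nil
# when no registration of that type was detected.
def registration_of_type(registrations, type)
  match = registrations.find { |registration| registration.type == type }
  match&.value
end

# Numbers from the sample output above; the types are assumed for this sketch.
registrations = [
  SketchRegistration.new('SIREN', '501124705'),
  SketchRegistration.new('VAT NUMBER', 'FR33501124705')
]

puts registration_of_type(registrations, 'VAT NUMBER')
```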
Taxes
taxes (Array< TaxField >): Contains tax fields as seen on the invoice.
value (Float): The tax amount.
# Show the amount of the first tax
puts invoice_data.document.taxes[0].value
code (String): The tax code (HST, GST, etc. for Canada; City Tax, State Tax, etc. for the US).
# Show the code of the first tax
puts invoice_data.document.taxes[0].code
rate (Float): The tax rate.
# Show the rate of the first tax
puts invoice_data.document.taxes[0].rate
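An invoice can carry several tax lines, so summing them is a common follow-up step. The sketch below adds up the tax amounts while skipping lines with no value. SketchTax and sum_of_taxes are hypothetical names, and the sample line reuses the figures from the output shown earlier.

```ruby
# Hypothetical stand-in for one tax line (not the Mindee TaxField class).
SketchTax = Struct.new(:value, :rate, :code)

# Sum the amounts of all tax lines, skipping lines with a nil value.
# The result is rounded to 2 decimals to limit floating point noise.
def sum_of_taxes(taxes)
  taxes.map(&:value).compact.sum(0.0).round(2)
end

# Tax line from the sample output above: 97.98 at a 20.0% rate.
taxes = [SketchTax.new(97.98, 20.0, nil)]

puts sum_of_taxes(taxes)
```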
Total Amounts
total_incl (Field): Total amount including taxes.
# To get the total amount including taxes value (float), ex: 14.24
total_incl = invoice_data.document.total_incl.value
total_excl (Field): Total amount excluding taxes.
# To get the total amount excluding taxes value (float), ex: 10.21
total_excl = invoice_data.document.total_excl.value
total_tax (Field): Total tax value from tax lines.
# To get the total tax amount value (float), ex: 8.42
total_tax = invoice_data.document.total_tax.value
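The three totals can be cross-checked against each other, since the amount including taxes should equal the amount excluding taxes plus the total tax. This is a minimal sketch of such a check. totals_consistent? is a hypothetical helper, and the 0.01 tolerance is an arbitrary choice to absorb floating point rounding.

```ruby
# Hypothetical helper: check that total_incl equals total_excl + total_tax
# within a small tolerance. Returns false when any amount is missing.
def totals_consistent?(total_incl, total_excl, total_tax, tolerance: 0.01)
  return false if [total_incl, total_excl, total_tax].any?(&:nil?)

  (total_incl - (total_excl + total_tax)).abs <= tolerance
end

# Amounts from the sample output above: 489.97 + 97.98 = 587.95.
puts totals_consistent?(587.95, 489.97, 97.98)
```

With the real client, the three arguments would be the value attributes of total_incl, total_excl, and total_tax.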