Receipt OCR Ruby

The Ruby OCR SDK supports the receipt API for extracting data from receipts.

require 'mindee'

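# Init a new client and configure the receipt API
# (the client construction below is an assumption; check it against your SDK version)
mindee_client = Mindee::Client.new.config_receipt('my-api-key')

# Load a file from disk and parse it as a receipt
receipt_data = mindee_client.doc_from_path('/path/to/the/receipt.jpg').parse('receipt')

# Print a summary of the document-level prediction
puts receipt_data.document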

The sample receipt below is used throughout this page to illustrate how to extract data with the OCR SDK.
[Image: sample receipt]

Response Objects

The response object is common to all documents, including custom documents. The main properties are:

Document Level Prediction

The document attribute is a Receipts object which contains the data extracted from the entire document, all pages included.

The same field may appear on several pages, but at the document level only the value with the highest confidence is shown. The API makes this selection automatically.
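
To make the selection rule concrete, here is a minimal sketch of picking the highest-confidence value among page-level candidates. The API performs this step itself, so none of this is needed in practice; the `Candidate` struct, the `best_candidate` helper, and the sample values are hypothetical and are not part of the SDK.

```ruby
# Hypothetical stand-in for a field predicted on one page (not an SDK class).
Candidate = Struct.new(:page, :value, :confidence)

# Return the candidate with the highest confidence, ignoring empty values.
# Returns nil when no candidate has a value.
def best_candidate(candidates)
  candidates.reject { |c| c.value.nil? }.max_by(&:confidence)
end

# Invented page-level predictions for the merchant name.
merchant_candidates = [
  Candidate.new(0, 'CLACHAN', 0.97),
  Candidate.new(1, 'CLACHAM', 0.41),
  Candidate.new(2, nil, 0.0)
]

best = best_candidate(merchant_candidates)
puts "#{best.value} (page #{best.page}, confidence #{best.confidence})"
```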

# print the complete object
pp receipt_data.document

# print a summary of the document-level info
puts receipt_data.document


-----Receipt data-----
Total amount including taxes: 10.2
Total amount excluding taxes: 8.5
Date: 2016-02-26
Category: food
Time: 15:20
Merchant name: CLACHAN
Taxes: 1.7 20.0%
Total taxes: 1.7
Locale: en-GB; en; GB; GBP;

Page Level Prediction

The pages attribute is an array of Receipt objects, each holding the data extracted from one page of the document.
All response objects have this property, regardless of the number of pages. Single-page documents have a single entry.

Iteration is done like any Ruby array:

receipt_data.pages.each do |page|
  # as object, complete
  pp page

  # as string, summary
  puts page
end

Raw HTTP Response

This contains the full Mindee API HTTP response. This can be useful for debugging.

# full HTTP response object
puts receipt_data.http_response

Extracted Fields

Each Receipt object contains a set of different fields. Each Field object contains at a minimum the following attributes:

  • value (String or Float depending on the field type): corresponds to the field value. Can be nil if no value was extracted.
  • confidence (Float): the confidence score of the field prediction.
  • bbox (Array< Array< Float > >): contains the relative coordinates of exactly 4 vertices (points) of an upright rectangle containing the field in the document.
  • polygon (Array< Array< Float > >): contains the relative coordinates of the vertices (points) of a polygon containing the field in the image.
  • reconstructed (Boolean): true if the field was reconstructed or computed using other fields.
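
Because bbox and polygon points are relative, they usually need scaling before they can be drawn on an image. The sketch below assumes each coordinate is a fraction between 0 and 1 of the page width or height; the `to_pixels` helper, the sample bbox, and the image size are hypothetical and not part of the SDK.

```ruby
# Convert relative [x, y] points to pixel coordinates for an image of the
# given size. Assumes each coordinate is a fraction between 0 and 1.
def to_pixels(points, image_width, image_height)
  points.map do |x, y|
    [(x * image_width).round, (y * image_height).round]
  end
end

# Invented bbox in the same shape as the bbox attribute: 4 [x, y] points.
bbox = [[0.25, 0.5], [0.75, 0.5], [0.75, 0.625], [0.25, 0.625]]

# Scale to a hypothetical 800 x 1200 pixel image.
pixel_bbox = to_pixels(bbox, 800, 1200)
pp pixel_bbox
```

The same helper works for polygon, since it only differs in the number of points.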

Additional Attributes

Depending on the field type specified, additional attributes can be extracted in the Receipt object.

Using the above receipt example, the following are the basic fields that can be extracted.


  • orientation (Orientation): The orientation field is only available at the page level as it describes whether the page image should be rotated to be upright.

If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:

  • 0 degrees: the page is already upright
  • 90 degrees: the page must be rotated clockwise to be upright
  • 270 degrees: the page must be rotated counterclockwise to be upright
# To get the orientation of the 1st page
orientation = receipt_data.pages[0].orientation.degrees
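
As an illustration, the three documented outputs can be mapped to the action needed to display the page upright. `rotation_hint` is a hypothetical helper, not an SDK method; it takes the integer returned by `orientation.degrees`.

```ruby
# Describe the rotation needed to make a page upright, following the three
# documented orientation outputs. Any other value is rejected.
def rotation_hint(degrees)
  case degrees
  when 0 then 'already upright'
  when 90 then 'rotate clockwise'
  when 270 then 'rotate counterclockwise'
  else raise ArgumentError, "unexpected orientation: #{degrees.inspect}"
  end
end

puts rotation_hint(90)
```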


  • category (Field): Receipt category as seen on the receipt.
    The following categories are supported: toll, food, parking, transport, accommodation, gasoline, miscellaneous.
# To get the category
category = receipt_data.document.category.value


Date fields:

  • contain the date_object attribute, which is a standard Ruby date object
  • contain the raw attribute, which is the textual representation found on the document.
  • have a value attribute which is the ISO 8601 representation of the date, regardless of the raw contents.

The following date fields are available:

  • date: Date the receipt was issued
# To get the receipt date of issuance
receipt_date = receipt_data.document.date.value
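
Since value is always an ISO 8601 string, it can also be parsed with Ruby's standard library when only the string has been stored. This is a minimal sketch: `parse_field_date` is a hypothetical helper, and the literal date is the one from the sample summary.

```ruby
require 'date'

# Parse the ISO 8601 string held in a date field's value attribute.
# Returns nil when no date was extracted.
def parse_field_date(value)
  value.nil? ? nil : Date.iso8601(value)
end

issued_on = parse_field_date('2016-02-26')
puts issued_on.strftime('%d %B %Y')
```

When the SDK object is at hand, the date_object attribute already provides the same Ruby date without any parsing.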


locale (Locale): Locale information.

  • locale.value (String): Locale with country and language codes.
# To get the full locale
locale = receipt_data.document.locale
  • locale.language (String): Language code in ISO 639-1 format as seen on the document.
    The following language codes are supported: ca, de, en, es, fr, it, nl and pt.
# To get the language code
language = receipt_data.document.locale.language
  • locale.currency (String): Currency code in ISO 4217 format as seen on the document.
    The following currency codes are supported: CAD, CHF, GBP, EUR, USD.
# To get the currency code
currency = receipt_data.document.locale.currency
  • locale.country (String): Country code in ISO 3166-1 alpha-2 format as seen on the document.
    The following country codes are supported: CA, CH, DE, ES, FR, GB, IT, NL, PT and US.
# To get the country code
country = receipt_data.document.locale.country

Supplier Information

  • supplier (Field): Supplier name as written in the receipt.
# To get the supplier name
supplier_name = receipt_data.document.supplier.value


taxes (Array< TaxField >): Contains tax fields as seen on the receipt.

  • value (Float): The tax amount.
# Show the amount of the first tax
puts receipt_data.document.taxes[0].value
  • code (String): The tax code (HST, GST, etc. for Canada; City Tax, State Tax, etc. for the US).
# Show the code of the first tax
puts receipt_data.document.taxes[0].code
  • rate (Float): The tax rate.
# Show the rate of the first tax
puts receipt_data.document.taxes[0].rate


  • time: Time of purchase as seen on the receipt
    • value (String): Time of purchase in 24-hour format (hh:mm).
    • raw (String): Time of purchase in whatever format appears on the receipt.
# To get the time
time = receipt_data.document.time.value
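
The date and time values can be combined into a single timestamp. The sketch below is one way to do it with the standard library; `purchase_timestamp` is a hypothetical helper, and the literals are taken from the sample summary.

```ruby
require 'time'

# Combine an ISO 8601 date string and an hh:mm time string into a Time.
# Returns nil when either part is missing. No zone is given in the two
# strings, so the result is expressed in the local time zone.
def purchase_timestamp(date_value, time_value)
  return nil if date_value.nil? || time_value.nil?

  Time.strptime("#{date_value} #{time_value}", '%Y-%m-%d %H:%M')
end

timestamp = purchase_timestamp('2016-02-26', '15:20')
puts timestamp.strftime('%Y-%m-%d %H:%M')
```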

Total Amounts

  • total_incl (Field): Total amount including taxes
# To get the total amount including taxes value
total_incl = receipt_data.document.total_incl.value
  • total_excl (Field): Total amount paid excluding taxes
# To get the total amount excluding taxes value
total_excl = receipt_data.document.total_excl.value
  • total_tax (Field): Total tax value from tax lines
# To get the total tax amount value
total_tax = receipt_data.document.total_tax.value
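
Extracted amounts can be sanity-checked against each other before they are stored. The following is a minimal sketch using the values from the sample summary; the `TaxLine` struct and the `amounts_match?` helper are hypothetical stand-ins rather than SDK classes, and the half-cent tolerance is an arbitrary choice.

```ruby
# Hypothetical stand-in for a tax line (not the SDK's TaxField class).
TaxLine = Struct.new(:value, :rate)

# True when two amounts agree to within the given tolerance.
def amounts_match?(left, right, tolerance = 0.005)
  (left - right).abs < tolerance
end

# Values from the sample summary.
total_incl = 10.2
total_excl = 8.5
total_tax = 1.7
taxes = [TaxLine.new(1.7, 20.0)]

# Sum of the tax lines; a missing (nil) value counts as 0.
tax_sum = taxes.sum { |tax| tax.value.to_f }

puts amounts_match?(tax_sum, total_tax)
puts amounts_match?(total_excl + total_tax, total_incl)
puts amounts_match?(total_excl * taxes[0].rate / 100, taxes[0].value)
```

Comparing with a tolerance rather than `==` avoids false mismatches caused by floating-point rounding.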

