Passport OCR Ruby

The Ruby OCR SDK supports the passport API for extracting data from passports.

require 'mindee'

# Init a new client and configure the passport API
mindee_client = Mindee::Client.new.config_passport('your-api-key')
passport_data = mindee_client.doc_from_path('/path/to/the/passport.jpg').parse('passport')
puts passport_data.document

Using the sample passport below, we will illustrate how to extract the data we want using the OCR SDK.
[Image: sample passport]

Response Objects

The response object is common to all documents, including custom documents. The main properties are:

Document Level Prediction

The document attribute is a Passport object which contains the data extracted from the entire document, all pages included.

The same field may appear on several pages, but at the document level only the prediction with the highest confidence is shown. The API does this automatically.
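To make the selection rule concrete, here is a minimal sketch of keeping the highest-confidence candidate for one field across pages. The API already does this for you; `Candidate` and `best_candidate` are hypothetical names, and the values are made up for illustration, not taken from the SDK or from a real response.

```ruby
# Hypothetical stand-in for one field prediction found on one page.
Candidate = Struct.new(:value, :confidence, :page)

# Return the candidate with the highest confidence, ignoring empty values.
def best_candidate(candidates)
  candidates.reject { |c| c.value.nil? }.max_by(&:confidence)
end

# Made-up predictions for the same field on three pages.
candidates = [
  Candidate.new('GBR', 0.62, 0),
  Candidate.new('GBR', 0.97, 1),
  Candidate.new(nil, 0.0, 2)
]

best = best_candidate(candidates)
puts "#{best.value} (page #{best.page}, confidence #{best.confidence})"
```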

# print the complete object
pp passport_data.document

# print a summary of the document-level info
puts passport_data.document


-----Passport data-----
Filename: passport.jpeg
Given names: HENERT
Country: GBR
ID Number: 707797979
Issuance date: 2012-04-22
Birth date: 1995-05-20
Expiry date: 2017-04-22
MRZ 1: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<
MRZ 2: 7077979792GBR9505209M1704224<<<<<<<<<<<<<<00
MRZ: P<GBRPUDARSAN<<HENERT<<<<<<<<<<<<<<<<<<<<<<<7077979792GBR9505209M1704224<<<<<<<<<<<<<<00

Page Level Prediction

The pages attribute is an array of Passport objects, each holding the data extracted from one page of the document.
All response objects have this property, regardless of the number of pages. Single-page documents have a single entry.

Iteration is done like any Ruby array:

passport_data.pages.each do |page|
  # as object, complete
  pp page

  # as string, summary
  puts page
end

Raw HTTP Response

This contains the full Mindee API HTTP response. This can be useful for debugging.

# full HTTP response object
puts passport_data.http_response

Extracted Fields

Each Passport object contains a set of different fields. Each Field object contains at a minimum the following attributes:

  • value (String or Float, depending on the field type): the field value. Can be nil if no value was extracted.
  • confidence (Float): the confidence score of the field prediction.
  • bbox (Array< Array< Float > >): exactly 4 relative vertex coordinates (points) of a right rectangle containing the field in the document.
  • polygon (Array< Array< Float > >): the relative vertex coordinates (points) of a polygon containing the field in the image.
  • reconstructed (Boolean): true if the field was reconstructed or computed using other fields.
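The attributes above suggest two common checks: guarding against nil or low-confidence values, and turning relative coordinates into pixels. The sketch below uses a hypothetical `FieldStub` struct in place of the SDK's Field class. The 0.8 threshold, the helper names, and the assumption that relative coordinates range from 0.0 to 1.0 of the page width and height are all illustrative choices, not SDK behaviour.

```ruby
# Hypothetical stand-in mirroring some of the attributes listed above.
FieldStub = Struct.new(:value, :confidence, :bbox, keyword_init: true)

# Return the value only when it exists and meets the confidence threshold.
def trusted_value(field, min_confidence: 0.8)
  return nil if field.value.nil? || field.confidence < min_confidence

  field.value
end

# Convert relative [x, y] points to pixel coordinates for a given image size.
def to_pixels(points, width, height)
  points.map { |x, y| [(x * width).round, (y * height).round] }
end

# Made-up field for illustration.
field = FieldStub.new(
  value: 'GBR',
  confidence: 0.99,
  bbox: [[0.1, 0.2], [0.3, 0.2], [0.3, 0.25], [0.1, 0.25]]
)

puts trusted_value(field).inspect
puts to_pixels(field.bbox, 1000, 800).inspect
```

Returning nil for untrusted values keeps the calling code on a single check, since a missing value and a low-confidence one usually need the same fallback.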

Additional Attributes

Depending on the field type specified, additional attributes can be extracted from the Passport object.

Using the above passport example, the following are the basic fields that can be extracted.


Orientation

  • orientation (Orientation): The orientation field is only available at the page level, as it describes whether the page image should be rotated to be upright.

If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:

  • 0 degrees: the page is already upright
  • 90 degrees: the page must be rotated clockwise to be upright
  • 270 degrees: the page must be rotated counterclockwise to be upright
# To get the orientation of the 1st page
orientation = passport_data.pages[0].orientation.degrees
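As a small illustration of acting on the three possible outputs, the sketch below maps a degrees value to a human-readable hint. `rotation_hint` is a hypothetical helper, not part of the SDK, and the fallback branch is an assumption for values outside the documented set.

```ruby
# Describe the rotation implied by an orientation value in degrees.
def rotation_hint(degrees)
  case degrees
  when 0 then 'already upright'
  when 90 then 'rotate clockwise to be upright'
  when 270 then 'rotate counterclockwise to be upright'
  else 'unknown orientation' # assumption: anything else is unexpected
  end
end

puts rotation_hint(90)
```

With a real response, the argument would be `passport_data.pages[0].orientation.degrees`.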

Birth Place

  • birth_place (Field): Passport owner's birthplace.
# To get the passport owner's birthplace
birth_place = passport_data.document.birth_place.value


Country

  • country (Field): Passport country code.
# To get the passport country code
country_code = passport_data.document.country.value


Date fields:

  • contain the date_object attribute, which is a standard Ruby date object
  • can contain the raw attribute, which is the textual representation found on the document.
  • have a value attribute which is the ISO 8601 representation of the date, regardless of the raw contents.

The following date fields are available:

  • expiry_date: Passport expiry date.
# To get the passport expiry date
expiry_date = passport_data.document.expiry_date.value
  • issuance_date: Passport date of issuance.
# To get the passport date of issuance
issuance_date = passport_data.document.issuance_date.value
  • birth_date: Passport owner's date of birth.
# To get the passport owner's date of birth
birth_date = passport_data.document.birth_date.value
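Because the value attribute is an ISO 8601 string, it can be parsed with Ruby's standard `date` library. The sketch below is one way to check expiry and compute a validity period. `expired?` and `days_between` are hypothetical helpers, and the strings are the dates from the sample summary above, passed as plain literals rather than read from the SDK.

```ruby
require 'date'

# True when the ISO 8601 date string falls before the reference date.
def expired?(iso_date, on: Date.today)
  Date.iso8601(iso_date) < on
end

# Whole days between two ISO 8601 date strings.
def days_between(iso_start, iso_end)
  (Date.iso8601(iso_end) - Date.iso8601(iso_start)).to_i
end

# Expiry and issuance dates from the sample summary.
puts expired?('2017-04-22', on: Date.new(2020, 1, 1))
puts days_between('2012-04-22', '2017-04-22')
```

With a real response, the strings would come from `passport_data.document.expiry_date.value` and `passport_data.document.issuance_date.value`. Both can be nil, so a nil check is advisable before parsing.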


Gender

  • gender (Field): Passport owner's gender (M / F).
# To get the passport owner's gender, a string that is either M or F
gender = passport_data.document.gender.value

Given Names

  • given_names (Array< Field >): List of passport owner's given names.
# To get the list of names
given_names = passport_data.document.given_names
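Since given_names is a list of fields rather than a single string, a common follow-up is to join the individual values. The sketch below uses a hypothetical `NameField` struct that models only the value attribute. `joined_given_names` is an illustrative helper, not an SDK method.

```ruby
# Hypothetical stand-in for SDK Field objects; only `value` is modelled.
NameField = Struct.new(:value)

# Join the non-empty given-name values into a single string.
def joined_given_names(fields)
  fields.map(&:value).compact.reject(&:empty?).join(' ')
end

# Given name from the sample summary.
puts joined_given_names([NameField.new('HENERT')])
```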


ID Number

  • id_number (Field): Passport identification number.
# To get the passport ID number (string)
id_number = passport_data.document.id_number.value

Machine Readable Zone

  • mrz1 (Field): Passport first line of machine-readable zone.
# To get the first line of the passport machine-readable zone (string)
mrz1 = passport_data.document.mrz1.value
  • mrz2 (Field): Passport second line of machine-readable zone.
# To get the second line of the passport machine-readable zone (string)
mrz2 = passport_data.document.mrz2.value
  • mrz (Field): Reconstructed passport full machine-readable zone from mrz1 and mrz2.
# To get the full passport machine-readable zone (string)
mrz = passport_data.document.mrz.value
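The second MRZ line repeats several extracted fields, so it can serve as a cross-check. The sketch below assumes the line follows the ICAO 9303 TD3 passport layout (fixed positions, check digits computed with repeating weights 7, 3, 1). The helper names are hypothetical and this is not how the SDK itself processes the MRZ. The input is the MRZ 2 string from the sample summary.

```ruby
# Numeric value of an MRZ character: digits as-is, A-Z as 10-35, '<' as 0.
def mrz_char_value(char)
  case char
  when /\A[0-9]\z/ then char.to_i
  when /\A[A-Z]\z/ then char.ord - 'A'.ord + 10
  else 0 # filler character '<'
  end
end

# Check digit: weighted sum with repeating weights 7, 3, 1, modulo 10.
def mrz_check_digit(data)
  weights = [7, 3, 1]
  values = data.chars.each_with_index.map do |char, i|
    mrz_char_value(char) * weights[i % 3]
  end
  values.sum % 10
end

# Split the leading fields of the second MRZ line (assumed TD3 layout).
def parse_mrz2(line)
  {
    id_number: line[0, 9].delete('<'),
    id_check: line[9].to_i,
    nationality: line[10, 3],
    birth_date: line[13, 6],   # YYMMDD
    birth_check: line[19].to_i,
    gender: line[20],
    expiry_date: line[21, 6],  # YYMMDD
    expiry_check: line[27].to_i
  }
end

mrz2 = '7077979792GBR9505209M1704224<<<<<<<<<<<<<<00'
parsed = parse_mrz2(mrz2)
puts parsed
puts mrz_check_digit(mrz2[0, 9]) == parsed[:id_check]
```

A mismatch between a computed and a printed check digit is a useful signal that the OCR misread a character in that field.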


Surname

  • surname (Field): Passport owner's surname.
# To get the passport owner's surname
surname = passport_data.document.surname.value

