Getting Started

This guide will help you get started with the Mindee Ruby client library to easily extract data from your documents.

The Ruby client supports the invoice, receipt, passport, and OCR APIs, as well as custom-built APIs from the API Builder.

You can view the source code on GitHub.

Prerequisite

Download and install Ruby. This library is officially supported on Ruby 2.6 to 2.7.
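
To confirm that the interpreter you are running falls within that range, you can compare RUBY_VERSION against the bounds. This is a minimal sketch; the helper name supported_ruby? is made up for illustration and is not part of the Mindee gem.

```ruby
# Hypothetical helper (not part of the Mindee gem): checks a Ruby version
# string against the officially supported 2.6 to 2.7 range.
def supported_ruby?(version = RUBY_VERSION)
  v = Gem::Version.new(version)
  v >= Gem::Version.new('2.6') && v < Gem::Version.new('2.8')
end

puts "Ruby #{RUBY_VERSION} officially supported: #{supported_ruby?}"
```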

Installation

To quickly get started with the Ruby client library, add this line to your application's Gemfile:

gem 'mindee'

And then execute:

bundle install

Or you can install it like this:

gem install mindee

Finally, Ruby away!

Development Installation

If you'll be modifying the source code, you'll need to install the required libraries to get started.

We recommend using Bundler.

  1. First clone the repo.
git clone git@github.com:mindee/mindee-api-ruby.git
  2. Navigate to the cloned directory and install all required libraries.
cd mindee-api-ruby
bundle install

Updating Version

It is important to always check the version of the Mindee client library you are using, as new and updated features won’t work on old versions.

To upgrade the Mindee Ruby client library to the latest version, re-install the gem without specifying any version number.

gem install mindee

To upgrade the Mindee Ruby client library to a specific version, re-install the gem and specify the version number.

gem install mindee -v <version>

Usage

To get started with Mindee's APIs, you need to create a Client and you're ready to go.

Let's take a deep dive into how this works.

The Client

The Client centralizes document configurations in a single object. Documents are added to the Client using a config_xxx method. Since each config_xxx method returns the current Client object, you can chain all the calls together.
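
The chaining pattern described above can be sketched with a toy class. SketchClient and its internals are assumptions made purely for illustration; this is not the Mindee::Client source, and only the idea that each config method returns the current object is taken from this guide.

```ruby
# Toy stand-in for illustration only; this is NOT the Mindee::Client source.
class SketchClient
  attr_reader :doc_configs

  def initialize(api_key: nil)
    @api_key = api_key
    @doc_configs = {}
  end

  # Each config method records a document type with its key, then returns
  # self so that further calls can be chained onto it.
  def config_invoice(api_key: nil)
    @doc_configs['invoice'] = api_key || @api_key
    self
  end

  def config_receipt(api_key: nil)
    @doc_configs['receipt'] = api_key || @api_key
    self
  end
end

client = SketchClient.new(api_key: 'my-api-key')
                     .config_invoice
                     .config_receipt(api_key: 'receipt-api-key')

# Both document types are now registered on the same object.
p client.doc_configs
```

Because every call hands back the same object, the final value of the chain is still the client itself, which is why the result can be assigned directly to a variable.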

The Client requires your API key. You can either pass it directly to the constructor or set it through environment variables. You only need to specify API keys for the document endpoints you'll be using.

There are three ways to add documents to the client using the config_xxx methods.

Single Document

You can have a separate client for each document. If you have only a single document type you're working with, this is the easiest way to get started.

require 'mindee'

# Init a new client and configure the Invoice API
mindee_client = Mindee::Client.new(api_key: 'my-api-key').config_invoice

# Load a file from disk and parse it
api_response = mindee_client.doc_from_path("/path/to/the/invoice.pdf").parse("invoice")

# Print a brief summary of the parsed data
puts api_response.document.to_s

Multiple Documents

You can have all your documents configured in the same client. If you're working with multiple document types, this is the easiest way to get started. Since each config_xxx method returns the current client object, you can chain all the calls together.

You can also pass an API key for a specific document.

require 'mindee'

mindee_client = Mindee::Client.new(api_key: 'my-api-key')

mindee_client = mindee_client.config_invoice
                             .config_receipt(api_key: 'receipt-api-key')
                             .config_passport
                             .config_financial_doc
                             .config_custom_doc(
                               'wsnine',
                               'john'
                             )

Mix and Match

You can also mix and match. This approach is useful if you have a group of documents that needs to be handled in different ways.

require 'mindee'

financial_doc_client = Mindee::Client.new(api_key: 'my-api-key-1', raise_on_error: true)
                                     .config_financial_doc

passport_client = Mindee::Client.new(api_key: 'my-api-key-2', raise_on_error: false)
                                .config_passport

Environment Variables

API keys should be set as environment variables, especially for any production deployment.

The following environment variable will set the global API key:

MINDEE_API_KEY="global-api-key"

This is generally all you need to do; all your APIs will then work.

Fine-grained Control

However, you can also set specific keys as needed, much like with the config_xxx methods.

For off-the-shelf APIs, here are the environment variables for the API keys you can set:

MINDEE_INVOICE_API_KEY="invoice-api-key"
MINDEE_RECEIPT_API_KEY="receipt-api-key"
MINDEE_PASSPORT_API_KEY="passport-api-key"

📘

Info

financial_doc is a mixed data flow of invoices and receipts. You'll need an API key for both receipt and invoice endpoints.

For custom documents, you can also set environment variables for the API keys. From the example above, we will have:

export MINDEE_JOHN_WSNINE_API_KEY="w9-form-api-key"
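
Judging from this one example, the variable name appears to combine the second config_custom_doc argument ('john') with the first ('wsnine'), upper-cased. The sketch below assumes exactly that pattern and nothing more; the helper name and parameter names are hypothetical, and the gem may treat special characters differently.

```ruby
# Hypothetical helper: builds the environment variable name for a custom
# document, assuming the pattern MINDEE_<SECOND ARG>_<FIRST ARG>_API_KEY
# inferred from the single example in this guide.
def custom_doc_env_var(document_type, account_name)
  "MINDEE_#{account_name}_#{document_type}_API_KEY".upcase
end

puts custom_doc_env_var('wsnine', 'john')
```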

📘

Info

Order in which keys are applied:

  1. set in config_xxx function
  2. set in Client initialization
  3. set in MINDEE_XXXX_API_KEY specific environment variable
  4. set in MINDEE_API_KEY environment variable
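
The lookup order above can be sketched as a small function that returns the first key it finds. The helper resolve_api_key and its parameters are hypothetical; the gem's actual resolution logic may differ in detail.

```ruby
# Hypothetical helper illustrating the lookup order listed above.
# `env` defaults to the process environment but accepts any Hash-like object.
def resolve_api_key(doc_type, config_key: nil, client_key: nil, env: ENV)
  [
    config_key,                               # 1. set in config_xxx
    client_key,                               # 2. set in Client initialization
    env["MINDEE_#{doc_type.upcase}_API_KEY"], # 3. specific environment variable
    env['MINDEE_API_KEY']                     # 4. global environment variable
  ].find { |key| key && !key.empty? }
end

# Example with an explicit Hash standing in for the environment.
sample_env = {
  'MINDEE_RECEIPT_API_KEY' => 'receipt-api-key',
  'MINDEE_API_KEY' => 'global-api-key'
}
puts resolve_api_key('receipt', env: sample_env)
puts resolve_api_key('invoice', env: sample_env)
```

With this sample environment, the receipt lookup stops at the specific variable, while the invoice lookup falls through to the global one.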

Document Parsing

To parse a document, load it with one of the doc_from_xxx methods and call parse on the returned object. The document type must be specified when calling parse. The parsed data is available as an attribute of the response object (document in the examples below).

The different ways you can load and parse your data are through:

Path

This requires an absolute path, as a string.

invoice_response = mindee_client.doc_from_path("/path/to/the/invoice").parse("invoice")

# Print a summary of the parsed data
puts invoice_response.document.to_s
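
If you only have a relative path, Ruby's File.expand_path will turn it into an absolute one before you hand it to the client. The path below is a made-up placeholder.

```ruby
# Convert a relative path into an absolute one, resolved against the
# current working directory. The file does not need to exist for this step.
relative = 'invoices/march.pdf' # hypothetical location
absolute = File.expand_path(relative)

puts absolute
```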

File Object

A normal Ruby file object with a path. Must be in binary mode.

invoice_response = nil
File.open(INVOICE_FILE, 'rb') do |fo|
  invoice_response = mindee_client.doc_from_file(fo, "invoice").parse("invoice")
end

# Print a summary of the parsed data
puts invoice_response.document.to_s

Base64

Requires a base64 encoded string.

Note: The original filename of the encoded file is required when calling the method.

b64_string = "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLD...."
receipt_response = mindee_client.doc_from_b64string(b64_string, "receipt.jpg").parse("receipt")

# Print a summary of the parsed data
puts receipt_response.document.to_s
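
If your document is on disk rather than already encoded, a base64 string can be produced with Ruby's standard library. This is a minimal sketch; the helper file_to_b64 is hypothetical and not part of the Mindee gem, and it assumes a single-line (unwrapped) encoding is acceptable.

```ruby
require 'base64'

# Hypothetical helper: reads a file in binary mode and returns its contents
# as a single-line base64 string, together with the original filename.
def file_to_b64(path)
  [Base64.strict_encode64(File.binread(path)), File.basename(path)]
end
```

The two returned values correspond to the two arguments of doc_from_b64string shown above: the encoded string and the original filename.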

Bytes

Requires raw bytes.

Note: The original filename is required when calling the method.

raw_bytes = "%PDF-1.3\n%\xbf\xf7\xa2\xfe\n1 0 ob...".b
invoice_response = mindee_client.doc_from_bytes(raw_bytes, "invoice.pdf").parse("invoice")

# Print a summary of the parsed data
puts invoice_response.document.to_s
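
In practice, raw bytes usually come from a file or a network response rather than a literal. The sketch below reads them from disk with File.binread; the helper read_raw_bytes is hypothetical and not part of the Mindee gem.

```ruby
# Hypothetical helper: returns the raw bytes of a file as a binary
# (ASCII-8BIT) string, together with the original filename.
def read_raw_bytes(path)
  [File.binread(path), File.basename(path)]
end
```

The returned pair matches the two arguments of doc_from_bytes shown above: the bytes and the original filename.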
