Getting Started

This guide will help you get started with the Mindee Python OCR SDK to easily extract data from your documents.

The Python OCR SDK supports the invoice, passport, and receipt OCR APIs, as well as custom-built APIs from the API Builder.

You can view the source code on GitHub, and the package on PyPI.

Prerequisites

  • Download and install Python. This library is officially supported on Python 3.7 to 3.10.
  • Download and install the pip package manager.

Installation

The preferred way to install the Python OCR SDK is via pip.

pip install mindee

Development Installation

If you'll be modifying the source code, you'll need to install the development requirements to get started.

  1. First, clone the repo.
git clone git@github.com:mindee/mindee-api-python.git
  2. Then navigate to the cloned directory and install all development requirements.
cd mindee-api-python
pip install -e ".[dev,test]"

Updating the Version

It is important to always check the version of the Mindee OCR SDK you are using, as new and updated features won’t work on old versions.

To check the installed version:

pip show mindee

To get the latest version:

pip install mindee --upgrade

To install a specific version:

pip install mindee==<your_version>
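
The installed version can also be checked from Python itself. The following is a minimal sketch using the standard library's `importlib.metadata`, which assumes Python 3.8 or later; the helper name `installed_version` is hypothetical and not part of the SDK.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "mindee") -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed.

    Hypothetical helper for illustration; not part of the Mindee SDK.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("mindee is not installed")
    else:
        print(f"mindee {version} is installed")
```

This reports the same version number as `pip show mindee`, which can be handy when logging the environment at application startup.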

Usage

To use Mindee's APIs, you first need to create a Client.

Let's take a deep dive into how this works.

The Client

The Client requires your API key.

You can either pass the key directly to the constructor or through an environment variable.

In Constructor

from mindee import Client

# Init with your API key
mindee_client = Client(api_key="my-api-key")

Environment Variable

API keys should be set as environment variables, especially for any production deployment.

The following environment variable will set the global API key:

MINDEE_API_KEY="my-api-key"

Then in your code:

from mindee import Client

# Init without an API key
mindee_client = Client()
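
When relying on the environment variable, it can help to fail early with a clear message if the key is missing. The sketch below uses only the standard library; the helper `read_api_key` is a hypothetical name introduced for illustration, and how the SDK itself behaves when the variable is unset is not covered here.

```python
import os
from typing import Mapping, Optional


def read_api_key(env: Optional[Mapping[str, str]] = None,
                 name: str = "MINDEE_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; not part of the Mindee SDK.
    """
    # Fall back to the real process environment when no mapping is given
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set or is empty")
    return key
```

Calling `read_api_key()` before creating the Client surfaces a configuration problem at startup rather than at the first API call.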

Document Parsing

When parsing your document, the client calls the parse method, which returns an object that you can serialize to the API. The document type must be specified when calling the parse method. The parsed data is available as an attribute of the response object.

You can load a document for parsing in the following ways:

Path

This requires an absolute path, as a string.

input_doc = mindee_client.doc_from_path("/path/to/the/invoice.pdf")
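
Because an absolute path is required, a relative or `~`-prefixed path needs to be converted first. A minimal sketch with the standard library follows; `to_absolute_path` is a hypothetical helper name, not an SDK function.

```python
import os


def to_absolute_path(path: str) -> str:
    """Expand a leading ~ and resolve a relative path against the current directory.

    Hypothetical helper for illustration; not part of the Mindee SDK.
    """
    return os.path.abspath(os.path.expanduser(path))
```

The returned string can then be passed to `doc_from_path` in place of the hard-coded path shown above.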

File Object

Requires a standard Python file object opened from a path. The file must be opened in binary mode.

with open("/path/to/the/receipt.jpg", "rb") as fo:
    input_doc = mindee_client.doc_from_file(fo)

Base64

Requires a base64 encoded string.

Note: The original filename of the encoded file is required when calling the method.

b64_string = "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLD...."
input_doc = mindee_client.doc_from_b64string(b64_string, "receipt.jpg")
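
If you start from raw file content rather than an existing base64 string, the standard library can produce one. This is a minimal sketch; `encode_to_b64_string` is a hypothetical helper name, not an SDK function.

```python
import base64


def encode_to_b64_string(data: bytes) -> str:
    """Encode raw bytes as a base64 text string.

    Hypothetical helper for illustration; not part of the Mindee SDK.
    """
    # b64encode returns bytes, so decode to get a regular str
    return base64.b64encode(data).decode("ascii")


# The first four bytes of a JPEG file encode to the familiar "/9j/" prefix
jpeg_header = b"\xff\xd8\xff\xe0"
print(encode_to_b64_string(jpeg_header))  # /9j/4A==
```

The resulting string, together with the original filename, is what `doc_from_b64string` expects in the example above.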

Bytes

Requires raw bytes.

Note: The original filename is required when calling the method.

raw_bytes = b"%PDF-1.3\n%\xbf\xf7\xa2\xfe\n1 0 ob..."
input_doc = mindee_client.doc_from_bytes(raw_bytes, "invoice.pdf")
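
In practice the raw bytes usually come from a file or a network response rather than a literal. The sketch below reads a file into bytes and keeps its original filename alongside; `read_bytes_with_name` is a hypothetical helper name, not an SDK function.

```python
from pathlib import Path
from typing import Tuple


def read_bytes_with_name(path: str) -> Tuple[bytes, str]:
    """Read a file's raw content and return it with the file's base name.

    Hypothetical helper for illustration; not part of the Mindee SDK.
    """
    file_path = Path(path)
    return file_path.read_bytes(), file_path.name
```

The two returned values correspond to the two arguments passed to `doc_from_bytes` in the example above.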

Loading from bytes is useful when using FastAPI UploadFile objects.

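from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/process-file")
async def upload(upload: UploadFile):
    # Pass the uploaded content and its original filename to the client
    input_doc = mindee_client.doc_from_bytes(
        upload.file.read(),
        filename=upload.filename
    )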

 
