Custom OCR Python

The Python OCR SDK supports custom-built APIs created with the API Builder. If your document isn't covered by one of Mindee's Off-the-Shelf APIs, you can create your own API using the API Builder.

from mindee import Client

mindee_client = Client().config_custom_doc(
    document_type="wsnine",
    singular_name="w9",
    plural_name="w9s",
    account_name="john",
    api_key="w9-form-api-key",  # optional, can be set in environment
    # version="1.2",  # optional, see the Configuring the Client section below
)

# Load a file from disk and parse it
w9_data = mindee_client.doc_from_path("/path/to/w9.jpg").parse("wsnine")

# Print a brief summary of the parsed data
print(w9_data.w9)

Configuring the Client

Below is the specification for custom endpoint configuration.

| Argument | Description |
| --- | --- |
| document_type | The document type is the API name from the Settings page. |
| singular_name | The name of the attribute used to retrieve a single document from the API response. |
| plural_name | The name of the attribute used to retrieve multiple documents from the API response. |
| account_name | Your organization's username in the API Builder. |
| api_key | Your API key for the endpoint. This can be set with an environment variable. |
| version | If set, locks the version of the model to use. If not set, uses the latest version of the model. If this is set, you'll be required to update your code every time a new model is trained. This is probably not needed for development but essential for production use. |
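One way to handle both cases is to build the keyword arguments for config_custom_doc in a small helper that adds version only when one is supplied. The sketch below is an illustration and not part of the SDK: the helper custom_doc_kwargs and the MINDEE_W9_MODEL_VERSION variable name are hypothetical, and the API key is assumed to come from the environment.

```python
import os
from typing import Dict, Optional


def custom_doc_kwargs(version: Optional[str] = None) -> Dict[str, str]:
    """Build keyword arguments for config_custom_doc (hypothetical helper)."""
    kwargs = {
        "document_type": "wsnine",
        "singular_name": "w9",
        "plural_name": "w9s",
        "account_name": "john",
    }
    # Only pin the model version when one was explicitly requested;
    # leaving it out means the latest model version is used.
    if version:
        kwargs["version"] = version
    return kwargs


# MINDEE_W9_MODEL_VERSION is an assumed variable name, not one the SDK reads.
kwargs = custom_doc_kwargs(os.environ.get("MINDEE_W9_MODEL_VERSION"))
# mindee_client = Client().config_custom_doc(**kwargs)
```

With this layout, a production deployment can set the variable to lock the model version, while a development machine can leave it unset.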

Environment Variables

API keys should be set as environment variables, especially for any production deployment. The environment variables can also be used for basic logging at various levels.

The format is

export MINDEE_<username>_<document_type>_API_KEY

where <username> and <document_type> are converted to uppercase and any - is replaced with _.

For the example above, the environment variable will be:

export MINDEE_JOHN_WSNINE_API_KEY="w9-form-api-key"
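The naming rule can be expressed as a short function, which is handy for checking which variable name a given account and document type map to. This is a minimal sketch of the rule as stated above; api_key_env_name is a hypothetical helper, not an SDK function.

```python
def api_key_env_name(username: str, document_type: str) -> str:
    """Return the environment variable name for a custom endpoint's API key."""

    def normalize(part: str) -> str:
        # Uppercase the value and replace every "-" with "_".
        return part.upper().replace("-", "_")

    return f"MINDEE_{normalize(username)}_{normalize(document_type)}_API_KEY"


print(api_key_env_name("john", "wsnine"))  # MINDEE_JOHN_WSNINE_API_KEY
```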

Parsing Documents

To parse your custom document, load it with the client, which returns an object that you can send to the API, then call that object's parse method. The document type must be specified when calling the parse method.

w9_data = mindee_client.doc_from_path("/path/to/w9.jpg").parse("wsnine")
print(w9_data.w9)

📘 Info

If your custom document has the same name as one of the off-the-shelf API documents, you must specify your account name when calling the parse method.

from mindee import Client

mindee_client = Client().config_custom_doc(
    document_type="receipt",
    singular_name="receipt",
    plural_name="receipts",
    account_name="JohnDoe",
    api_key="johndoe-receipt-api-key",
)

receipt_data = mindee_client.doc_from_path("/path/to/receipt.jpg").parse("receipt", "JohnDoe")

Accessing Field Values

The custom document object's JSON data structure consists of the document-level prediction, the page-level predictions, and the raw HTTP response.

For the following examples, we are using our own W9s custom API created with the API Builder.

📘 Info

We used a data model that may be different from yours. To adapt these examples to your own custom API, change the config_custom_doc call to use your own parameters.

Document Level Prediction

For document-level prediction, we construct the document class by combining the pages of a single document. The method used for creating a single document object from multiple pages relies on field confidence scores.

Basically, we iterate over each page, and for each field, we keep the value that has the highest confidence score.

print(w9_data.w9)
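The merge rule described above can be sketched in a few lines. This is an illustration of the idea only, not the SDK's or the API's actual implementation: the merge_pages helper and the per-field "confidence" key are assumptions made for the example.

```python
from typing import Any, Dict, List

# Assumed shape for one field candidate: {"content": "TX", "confidence": 0.95}
Field = Dict[str, Any]


def merge_pages(pages: List[Dict[str, Field]]) -> Dict[str, Field]:
    """Keep, for each field name, the candidate with the highest confidence."""
    merged: Dict[str, Field] = {}
    for page in pages:
        for name, candidate in page.items():
            best = merged.get(name)
            if best is None or candidate["confidence"] > best["confidence"]:
                merged[name] = candidate
    return merged


# Two hand-written pages standing in for per-page predictions.
pages = [
    {"city": {"content": "Austin", "confidence": 0.61}},
    {
        "city": {"content": "Austin, TX", "confidence": 0.88},
        "state": {"content": "TX", "confidence": 0.95},
    },
]
document = merge_pages(pages)
print(document["city"]["content"])
```

A field that appears on only one page is kept as-is; a field that appears on several pages is taken from whichever page scored it highest.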

Page Level Prediction

For page-level prediction in multi-page PDFs, we construct the document class from a single page of the PDF.

for w9 in w9_data.w9s:
    print(w9)

Raw HTTP Response

This contains the full Mindee API HTTP response object in JSON format.

import json

# Use json.dumps to display the full HTTP response as properly formatted JSON.
print(json.dumps(w9_data.http_response, indent=4, sort_keys=True))

Additional Fields

You can extract additional fields from your custom document. To do so, you'll need to specify the fields to be extracted from your document based on your data model.

The following is the list of fields we want to extract based on our own data model, provided as an example: yours will be different. The object names of the fields are the same as the field names in your data model.

..
            "features_name": [
                "name",
                "street_address",
                "city",
                "state",
                "zip_code",
                "social_security_number"
            ]
        }
    }
}

📘 Info

The information for each field is an array, as there is no post-processing of your documents. To access information for a specific page, use the raw HTTP response.

City

The taxpayer's city.

for city in w9_data.w9.city["values"]:
    print(city["content"])

Name

The taxpayer's name.

for name in w9_data.w9.name["values"]:
    print(name["content"])

Social Security Number

The taxpayer's social security number.

for social_security_number in w9_data.w9.social_security_number["values"]:
    print(social_security_number["content"])

State

The taxpayer's state.

for state in w9_data.w9.state["values"]:
    print(state["content"])

Street Address

The taxpayer's street address.

for street_address in w9_data.w9.street_address["values"]:
    print(street_address["content"])

Zip Code

The taxpayer's zip code.

for zip_code in w9_data.w9.zip_code["values"]:
    print(zip_code["content"])
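The per-field loops above all follow one pattern, so they can be folded into a single helper. The sketch below is an illustration only: field_contents is a hypothetical helper, and sample_w9 is a hand-built stand-in for w9_data.w9 that mirrors the values/content shape used above, so the example runs without calling the API.

```python
from types import SimpleNamespace
from typing import Any, Dict, List

# Field names taken from the example data model above; yours will differ.
FIELD_NAMES = [
    "name",
    "street_address",
    "city",
    "state",
    "zip_code",
    "social_security_number",
]


def field_contents(document: Any, field_names: List[str]) -> Dict[str, List[str]]:
    """Collect the content of every value for each named field."""
    return {
        name: [value["content"] for value in getattr(document, name)["values"]]
        for name in field_names
    }


# Stand-in for w9_data.w9, filled with made-up sample values.
sample_w9 = SimpleNamespace(
    name={"values": [{"content": "Jane"}, {"content": "Doe"}]},
    street_address={"values": [{"content": "1"}, {"content": "Main"}, {"content": "St"}]},
    city={"values": [{"content": "Austin"}]},
    state={"values": [{"content": "TX"}]},
    zip_code={"values": [{"content": "00000"}]},
    social_security_number={"values": []},
)

contents = field_contents(sample_w9, FIELD_NAMES)
for field_name, words in contents.items():
    print(field_name, " ".join(words))
```

With a real response, the same call would be field_contents(w9_data.w9, FIELD_NAMES), assuming your data model exposes these field names.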
