Custom OCR Python
The Python OCR SDK supports custom-built APIs created with the API Builder. If your document isn't covered by one of Mindee's Off-the-Shelf APIs, you can create your own API using the API Builder.
For the following examples, we are using our own W9s custom API,
created with the API Builder.
from mindee import Client, documents
# Init a new client and add your custom endpoint (document)
mindee_client = Client(api_key="my-api-key").add_endpoint(
    account_name="john",
    endpoint_name="wnine",
    # version="1.2",  # optional, see "Adding the Endpoint" below
)
# Load a file from disk and parse it.
# The endpoint name must be specified since it can't be determined from the class.
result = mindee_client.doc_from_path(
    "/path/to/the/w9.jpg"
).parse(documents.TypeCustomV1, endpoint_name="wnine")
# Print a brief summary of the parsed data
print(result.document)
Adding the Endpoint
Below are the arguments for adding a custom endpoint using the add_endpoint method.
endpoint_name
: The API name shown on the Settings page of your custom API.
account_name
: Your organization's or user's name in the API Builder.
version
: If set, locks the version of the model to use. You will then need to update your code every time a new model is trained.
This is probably not needed for development but is essential for production use.
If not set, the latest version of the model is used.
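One way to pin the version in production while leaving it unset during development is to build the add_endpoint arguments from configuration. The following is a minimal sketch under stated assumptions: the helper endpoint_kwargs and the environment variable MINDEE_ENDPOINT_VERSION are hypothetical names chosen for illustration and are not part of the SDK.

```python
import os
from typing import Dict, Optional


def endpoint_kwargs(
    account_name: str,
    endpoint_name: str,
    version: Optional[str] = None,
) -> Dict[str, str]:
    """Build keyword arguments for add_endpoint, including version only when set."""
    kwargs = {"account_name": account_name, "endpoint_name": endpoint_name}
    if version:
        # Pinning is opt-in: an empty or missing version means "use the latest".
        kwargs["version"] = version
    return kwargs


# MINDEE_ENDPOINT_VERSION is a hypothetical variable name used only in this sketch.
pinned_version = os.environ.get("MINDEE_ENDPOINT_VERSION")
kwargs = endpoint_kwargs("john", "wnine", version=pinned_version)
print(kwargs)

# With the SDK installed, the result could be passed on like this:
# mindee_client = Client(api_key="my-api-key").add_endpoint(**kwargs)
```

Keeping the version out of the source code means a newly trained model can be adopted by changing configuration rather than editing the call site.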
Parsing Documents
The client calls the parse method when parsing your custom document, which returns an object containing the prediction results for the file you sent.
The endpoint_name must be specified when calling the parse method for a custom endpoint.
result = mindee_client.doc_from_path("/path/to/receipt.jpg").parse(
    documents.TypeCustomV1, endpoint_name="wnine"
)
print(result.document)
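When several files go to the same custom endpoint, the call shown above can be wrapped in a small loop. This is only a sketch: parse_all is a hypothetical helper, not an SDK function, and it relies solely on the doc_from_path and parse calls already shown on this page.

```python
from typing import Any, Dict, Iterable


def parse_all(
    client: Any,
    paths: Iterable[str],
    document_class: Any,
    endpoint_name: str,
) -> Dict[str, Any]:
    """Parse each file with the same custom endpoint and return {path: document}."""
    parsed = {}
    for path in paths:
        # The endpoint name is repeated on every call because it
        # cannot be determined from the document class.
        result = client.doc_from_path(path).parse(
            document_class, endpoint_name=endpoint_name
        )
        parsed[path] = result.document
    return parsed


# With the SDK installed and a configured client (paths are placeholders):
# docs = parse_all(mindee_client, ["/path/to/a.jpg"], documents.TypeCustomV1, "wnine")
```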
Info
If your custom document has the same name as an off-the-shelf API document,
you must specify your account name when calling the parse method:
from mindee import Client, documents
mindee_client = Client(api_key="johndoe-receipt-api-key").add_endpoint(
    endpoint_name="receipt",
    account_name="JohnDoe",
)
result = mindee_client.doc_from_path("/path/to/receipt.jpg").parse(
    documents.TypeCustomV1,
    endpoint_name="receipt",
    account_name="JohnDoe",
)
Document Fields
All the fields defined in the API Builder when creating your custom document are available.
In custom documents, each field will hold an array of all the words in the document which are related to that field.
Each word is an object that has the text content, geometry information, and confidence score.
Value fields can be accessed via the fields attribute.
Classification fields can be accessed via the classifications attribute.
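Because each word carries a confidence score, low-confidence words can be filtered out before the values are used. The sketch below runs offline on a stand-in class. The attribute names content and confidence are assumptions about the word objects, and StandInWord, confident_words and the sample words are made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass
class StandInWord:
    """Hypothetical stand-in for a word object; attribute names are assumptions."""

    content: str
    confidence: float


def confident_words(words: Iterable[Any], threshold: float = 0.5) -> List[str]:
    """Return the text of every word whose confidence meets the threshold."""
    return [word.content for word in words if word.confidence >= threshold]


# Made-up sample words, including one low-confidence misread.
sample = [
    StandInWord("123", 0.99),
    StandInWord("Mian", 0.31),
    StandInWord("St", 0.87),
]
print(confident_words(sample, threshold=0.5))
```

The right threshold depends on the document type and on how costly a wrong value is, so it is best treated as a tunable setting.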
Info
Both document level and page level objects work in the same way.
Fields Attribute
The fields attribute is a dictionary with the following structure:
- key: the API name of the field, as a str
- value: a ListField object which has a values attribute, containing a list of all values found for the field.
Individual field values can be accessed by using the field's API name. The examples below use the address field.
# raw data, list of each word object
print(result.document.fields["address"].values)
# list of all values
print(result.document.fields["address"].contents_list)
# default string representation
print(str(result.document.fields["address"]))
# custom string representation
print(result.document.fields["address"].contents_string(separator="_"))
To iterate over all the fields:
for name, info in result.document.fields.items():
    print(name)
    print(info.values)
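The same iteration can collapse every field into a plain dictionary of strings, which is convenient for logging or export. This is a minimal sketch: fields_to_text is a hypothetical helper, and StandInListField only mimics the contents_list attribute and contents_string method shown above so that the example runs without the SDK.

```python
from typing import Any, Dict, List, Mapping


def fields_to_text(fields: Mapping[str, Any], separator: str = " ") -> Dict[str, str]:
    """Map each field's API name to its words joined into one string."""
    return {
        name: field.contents_string(separator=separator)
        for name, field in fields.items()
    }


class StandInListField:
    """Hypothetical stand-in for ListField, only so the sketch runs offline."""

    def __init__(self, words: List[str]) -> None:
        self.contents_list = words

    def contents_string(self, separator: str = " ") -> str:
        return separator.join(self.contents_list)


# Made-up sample data; with a real response, pass result.document.fields instead.
sample_fields = {
    "address": StandInListField(["123", "Main", "St"]),
    "name": StandInListField(["Jane", "Doe"]),
}
print(fields_to_text(sample_fields))
```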
Classifications Attribute
The classifications attribute is a dictionary with the following structure:
- key: the API name of the field, as a str
- value: a ClassificationField object which has a value attribute, containing a string representation of the detected classification.
# string representation of the detected classification
print(result.document.classifications["doc_type"].value)
To iterate over all the classifications:
for name, info in result.document.classifications.items():
    print(name)
    print(info.value)
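Indexing the dictionary directly raises a KeyError when a classification is not present, so a small lookup with a fallback can be useful. The sketch below is illustrative only: classification_value is a hypothetical helper, SimpleNamespace stands in for ClassificationField, and "w9" is a made-up sample value.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def classification_value(
    classifications: Mapping[str, Any],
    name: str,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the detected value for one classification, or default if absent."""
    field = classifications.get(name)
    return field.value if field is not None else default


# SimpleNamespace stands in for ClassificationField so the sketch runs offline.
sample = {"doc_type": SimpleNamespace(value="w9")}
print(classification_value(sample, "doc_type"))
print(classification_value(sample, "missing", default="unknown"))

# With a real response:
# classification_value(result.document.classifications, "doc_type")
```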