IND Passport - India OCR Python
The Python OCR SDK supports the Passport - India API.
Using the sample below, we are going to illustrate how to extract the data that we want using the OCR SDK.
Quick-Start
from mindee import Client, product, AsyncPredictResponse
# Init a new client
mindee_client = Client(api_key="my-api-key")
# Load a file from disk
input_doc = mindee_client.source_from_path("/path/to/the/file.ext")
# Load a file from disk and enqueue it.
result: AsyncPredictResponse = mindee_client.enqueue_and_parse(
product.ind.IndianPassportV1,
input_doc,
)
# Print a brief summary of the parsed data
print(result.document)
Output (RST):
########
Document
########
:Mindee ID: cf88fd43-eaa1-497a-ba29-a9569a4edaa7
:Filename: default_sample.jpg
Inference
#########
:Product: mindee/ind_passport v1.0
:Rotation applied: Yes
Prediction
==========
:Page Number: 1
:Country: IND
:ID Number: J8369854
:Given Names: JOCELYN MICHELLE
:Surname: DOE
:Birth Date: 1959-09-23
:Birth Place: GUNDUGOLANU
:Issuance Place: HYDERABAD
:Gender: F
:Issuance Date: 2011-10-11
:Expiry Date: 2021-10-10
:MRZ Line 1: P<DOE<<JOCELYNMICHELLE<<<<<<<<<<<<<<<<<<<<<
:MRZ Line 2: J8369854<4IND5909234F2110101<<<<<<<<<<<<<<<8
:Legal Guardian:
:Name of Spouse:
:Name of Mother:
:Old Passport Date of Issue:
:Old Passport Number:
:Address Line 1:
:Address Line 2:
:Address Line 3:
:Old Passport Place of Issue:
:File Number:
Field Types
Standard Fields
These fields are generic and used in several products.
BaseField
Each prediction object contains a set of fields that inherit from the generic BaseField
class.
A typical BaseField
object will have the following attributes:
- value (
Union[float, str]
): corresponds to the field value. Can beNone
if no value was extracted. - confidence (
float
): the confidence score of the field prediction. - bounding_box (
[Point, Point, Point, Point]
): contains exactly 4 relative vertices (points) coordinates of a right rectangle containing the field in the document. - polygon (
List[Point]
): contains the relative vertices coordinates (Point
) of a polygon containing the field in the image. - page_id (
int
): the ID of the page, alwaysNone
when at document-level. - reconstructed (
bool
): indicates whether an object was reconstructed (not extracted as the API gave it).
Note: A
Point
simply refers to a List of two numbers ([float, float]
).
Aside from the previous attributes, all basic fields have access to a custom __str__
method that can be used to print their value as a string.
ClassificationField
The classification field ClassificationField
does not implement all the basic BaseField
attributes. It only implements value, confidence and page_id.
Note: a classification field's
value is always a
str`.
DateField
Aside from the basic BaseField
attributes, the date field DateField
also implements the following:
- date_object (
Date
): an accessible representation of the value as a python object. Can beNone
.
StringField
The text field StringField
only has one constraint: its value is an Optional[str]
.
Attributes
The following fields are extracted for Passport - India V1:
Address Line 1
address1 (StringField): The first line of the address of the passport holder.
print(result.document.inference.prediction.address1.value)
Address Line 2
address2 (StringField): The second line of the address of the passport holder.
print(result.document.inference.prediction.address2.value)
Address Line 3
address3 (StringField): The third line of the address of the passport holder.
print(result.document.inference.prediction.address3.value)
Birth Date
birth_date (DateField): The birth date of the passport holder, ISO format: YYYY-MM-DD.
print(result.document.inference.prediction.birth_date.value)
Birth Place
birth_place (StringField): The birth place of the passport holder.
print(result.document.inference.prediction.birth_place.value)
Country
country (StringField): ISO 3166-1 alpha-3 country code (3 letters format).
print(result.document.inference.prediction.country.value)
Expiry Date
expiry_date (DateField): The date when the passport will expire, ISO format: YYYY-MM-DD.
print(result.document.inference.prediction.expiry_date.value)
File Number
file_number (StringField): The file number of the passport document.
print(result.document.inference.prediction.file_number.value)
Gender
gender (ClassificationField): The gender of the passport holder.
Possible values include:
- M
- F
print(result.document.inference.prediction.gender.value)
Given Names
given_names (StringField): The given names of the passport holder.
print(result.document.inference.prediction.given_names.value)
ID Number
id_number (StringField): The identification number of the passport document.
print(result.document.inference.prediction.id_number.value)
Issuance Date
issuance_date (DateField): The date when the passport was issued, ISO format: YYYY-MM-DD.
print(result.document.inference.prediction.issuance_date.value)
Issuance Place
issuance_place (StringField): The place where the passport was issued.
print(result.document.inference.prediction.issuance_place.value)
Legal Guardian
legal_guardian (StringField): The name of the legal guardian of the passport holder (if applicable).
print(result.document.inference.prediction.legal_guardian.value)
MRZ Line 1
mrz1 (StringField): The first line of the machine-readable zone (MRZ) of the passport document.
print(result.document.inference.prediction.mrz1.value)
MRZ Line 2
mrz2 (StringField): The second line of the machine-readable zone (MRZ) of the passport document.
print(result.document.inference.prediction.mrz2.value)
Name of Mother
name_of_mother (StringField): The name of the mother of the passport holder.
print(result.document.inference.prediction.name_of_mother.value)
Name of Spouse
name_of_spouse (StringField): The name of the spouse of the passport holder (if applicable).
print(result.document.inference.prediction.name_of_spouse.value)
Old Passport Date of Issue
old_passport_date_of_issue (DateField): The date of issue of the old passport (if applicable), ISO format: YYYY-MM-DD.
print(result.document.inference.prediction.old_passport_date_of_issue.value)
Old Passport Number
old_passport_number (StringField): The number of the old passport (if applicable).
print(result.document.inference.prediction.old_passport_number.value)
Old Passport Place of Issue
old_passport_place_of_issue (StringField): The place of issue of the old passport (if applicable).
print(result.document.inference.prediction.old_passport_place_of_issue.value)
Page Number
page_number (ClassificationField): The page number of the passport document.
Possible values include:
- 1
- 2
print(result.document.inference.prediction.page_number.value)
Surname
surname (StringField): The surname of the passport holder.
print(result.document.inference.prediction.surname.value)
Questions?
Updated 2 months ago