Proof of Address OCR
Extract recipient and issuer information from utility bills, tax returns, payslips, and more.
Using Mindee's Proof of Address API, you can automatically extract key information about the recipient or the issuer of a document to help you automate customer onboarding or KYC processes:
- Issuer Name
- Issuer Address
- Issuer Company Registration numbers
- Recipient Name
- Recipient Address
- Recipient Company Registration numbers
- Issuance Date
- Dates
- Currency
- Language
- Orientation
Set up the API
Before making any API calls, you need to have created your API key.
- You'll need a utility bill, or any other document containing an address block, to run your tests.
- Access your Proof of Address API by clicking on the Proof of Address card in the APIs Store.
- From the left navigation, go to documentation > API reference. You'll find sample code in popular languages and for the command line.
curl -X POST \
https://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict \
-H 'Authorization: Token my-api-key-here' \
-F document=@/path/to/your/file.png
import requests
url = "https://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict"
# The file must stay open while the request is sent, so the call lives inside the with block
with open("/path/to/my/file", "rb") as myfile:
    files = {"document": myfile}
    headers = {"Authorization": "Token my-api-key-here"}
    response = requests.post(url, files=files, headers=headers)
    print(response.text)
// works for NODE > v10
const axios = require('axios');
const fs = require("fs");
const FormData = require('form-data')
async function makeRequest() {
    let data = new FormData()
    data.append('document', fs.createReadStream('./file.jpg'))
    const config = {
        method: 'POST',
        url: 'https://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict',
        headers: {
            'Authorization':'Token my-api-key-here',
            ...data.getHeaders()
        },
        data
    }
    try {
        let response = await axios(config)
        console.log(response.data);
    } catch (error) {
        console.log(error)
    }
}
makeRequest()
# tested with Ruby 2.5
require 'uri'
require 'net/http'
require 'net/https'
require 'mime/types'
url = URI("https://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict")
file = "/path/to/your/file.png"
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
request = Net::HTTP::Post.new(url)
request["Authorization"] = 'Token my-api-key-here'
request.set_form([['document', File.open(file)]], 'multipart/form-data')
response = http.request(request)
puts response.read_body
<form onsubmit="mindeeSubmit(event)" >
    <input type="file" id="my-file-input" name="file" />
    <input type="submit" />
</form>
<script type="text/javascript">
    const mindeeSubmit = (evt) => {
        evt.preventDefault()
        let myFileInput = document.getElementById('my-file-input');
        let myFile = myFileInput.files[0]
        if (!myFile) { return }
        let data = new FormData();
        data.append("document", myFile, myFile.name);
        let xhr = new XMLHttpRequest();
        xhr.addEventListener("readystatechange", function () {
            if (this.readyState === 4) {
                console.log(this.responseText);
            }
        });
        xhr.open("POST", "https://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict");
        xhr.setRequestHeader("Authorization", "Token my-api-key-here");
        xhr.send(data);
    }
</script>
- Replace my-api-key-here with your new API key, or use the "select an API key" feature and it will be filled in automatically.
- Copy and paste the sample code of your choice into your application, code environment, or terminal.
- Replace /path/to/your/file.png with the path to your document.
Always remember to replace my-api-key-here with your own API key!
- Run your code. You will receive a JSON response with the document details.
API Response
Below is the full sample JSON response you get when you call the API. Since the response is quite verbose, we will walk through the fields section by section.
{
"api_request": {
"error": {},
"resources": [
"document"
],
"status": "success",
"status_code": 201,
"url": "http://api.mindee.net/v1/products/mindee/proof_of_address/v1/predict"
},
"document": {
"id": "ecdbe7bd-1037-47a5-87a8-b90d49475a1f",
"name": "sample_invoce.jpeg",
"n_pages": 1,
"is_rotation_applied": true,
"inference": {
"started_at": "2021-05-06T16:37:28",
"finished_at": "2021-05-06T16:37:29",
"processing_time": 1.125,
"pages": [
{
"id": 0,
"orientation": {"value": 0},
"prediction": { .. },
"extras": {}
}
],
"prediction": { .. },
"extras": {}
}
}
}
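Before reading any prediction, it can be worth checking the api_request block of the response. The following is a minimal Python sketch, not an official client: the helper name ensure_success is hypothetical, and it assumes that a failed call reports a status other than "success", which the sample above does not show.

```python
def ensure_success(response: dict) -> dict:
    """Return the `document` block, or raise if the API did not report success."""
    api_request = response.get("api_request", {})
    # Assumption: any status other than "success" means the call failed.
    if api_request.get("status") != "success":
        raise RuntimeError(
            f"Request failed with status code {api_request.get('status_code')}: "
            f"{api_request.get('error')}"
        )
    return response["document"]


# Trimmed-down copy of the sample response above.
sample = {
    "api_request": {"error": {}, "status": "success", "status_code": 201},
    "document": {"id": "ecdbe7bd-1037-47a5-87a8-b90d49475a1f", "n_pages": 1},
}
document = ensure_success(sample)
print(document["n_pages"])  # 1
```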
You can find the prediction within the prediction key, which appears in two locations:
- In document > inference > prediction for document-level predictions: it contains the different fields extracted at the document level, meaning that for multi-page PDFs, we reconstruct a single document object using all the pages.
- In document > inference > pages[ ] > prediction for page-level predictions: it gives the prediction for each page independently. With images, there is only one element in this array, but with PDFs, you can find the extracted data for each PDF page.
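As an illustration, the two locations can be read from a parsed response as sketched below in Python. The helper names and the two-page sample are made up for this example; only the key paths come from the response structure described above.

```python
def document_prediction(response: dict) -> dict:
    """Fields reconstructed for the whole document."""
    return response["document"]["inference"]["prediction"]


def page_predictions(response: dict) -> dict:
    """Map each page id to the prediction made for that page alone."""
    pages = response["document"]["inference"]["pages"]
    return {page["id"]: page["prediction"] for page in pages}


# Made-up, heavily trimmed two-page response used only to exercise the helpers.
sample = {
    "document": {
        "inference": {
            "pages": [
                {"id": 0, "prediction": {"issuer_name": {"value": "TURNPIKE DESIGNS CO."}}},
                {"id": 1, "prediction": {"dates": [{"value": "2018-09-25"}]}},
            ],
            "prediction": {
                "issuer_name": {"value": "TURNPIKE DESIGNS CO."},
                "dates": [{"value": "2018-09-25"}],
            },
        }
    }
}

print(sorted(document_prediction(sample)))  # ['dates', 'issuer_name']
print(sorted(page_predictions(sample)))     # [0, 1]
```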
Each predicted field may contain one or several values:
- a confidence score
- a polygon highlighting the information location
- a page_id where the information was found (document level only)
{
"prediction": {
"recipient_company_registrations": [
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[ 0.515, 0.962 ], [ 0.59, 0.962 ], [ 0.59, 0.973 ], [ 0.515, 0.973 ]],
"type": "SIRET",
"value": "XXX81125600010"
},
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[ 0.658, 0.963 ], [ 0.729, 0.963 ], [ 0.729, 0.973 ], [ 0.658, 0.973 ]],
"type": "VAT NUMBER",
"value": "FR44837811XXX"
}
],
"recipient_name": {
"confidence": 0.84,
"page_id": 0,
"polygon": [[0.035, 0.284], [0.098, 0.284], [0.098, 0.296], [0.035, 0.296]],
"value": "JIRO DOI"
},
"recipient_address": {
"confidence": 0.3,
"page_id": 0,
"polygon": [[0.035, 0.304], [0.214, 0.304], [0.214, 0.353], [0.035, 0.353]],
"value": "1954 Bloon Street West Toronto, ON, M6P 3K9 Canada"
},
"issuer_company_registrations":[
{
"confidence": 0.84,
"page_id": 0,
"polygon": [[0.113, 0.251], [0.206, 0.251], [0.206, 0.266], [0.113, 0.266]],
"type": "TIN",
"value": "736952710"
}
],
"dates": [
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.842, 0.305], [0.931, 0.305], [0.931, 0.319], [0.842, 0.319]],
"value": "2018-09-25"
}
],
"issuance_date": {
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.842, 0.305], [0.931, 0.305], [0.931, 0.319], [0.842, 0.319]],
"value": "2018-09-25"
},
"issuer_name": {
"confidence": 0.72,
"page_id": 0,
"polygon": [[0.164, 0.087], [0.4, 0.087], [0.4, 0.147], [0.164, 0.147]],
"value": "TURNPIKE DESIGNS CO."
},
"issuer_address": {
"confidence": 0.49,
"page_id": 0,
"polygon": [[0.756, 0.128], [0.964, 0.128], [0.964, 0.162], [0.756, 0.162]],
"value": "156 University Ave, Toronto ON, Canada M5H 2H7"
}
}
}
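Since every value carries a confidence score, one possible use is to flag low-scoring values for manual review. The Python sketch below is only an illustration: the function name and the 0.5 threshold are arbitrary choices made for this example, not recommendations from the API.

```python
def low_confidence_fields(prediction: dict, threshold: float = 0.5) -> list:
    """Return (field_name, value, confidence) tuples scoring below the threshold."""
    flagged = []
    for name, field in prediction.items():
        # A field is either a single object or a list of objects.
        items = field if isinstance(field, list) else [field]
        for item in items:
            if item["confidence"] < threshold:
                # .get() because some fields (such as locale) have no "value" key.
                flagged.append((name, item.get("value"), item["confidence"]))
    return flagged


# Subset of the sample prediction above, without polygons.
prediction = {
    "recipient_name": {"confidence": 0.84, "page_id": 0, "value": "JIRO DOI"},
    "recipient_address": {
        "confidence": 0.3,
        "page_id": 0,
        "value": "1954 Bloon Street West Toronto, ON, M6P 3K9 Canada",
    },
    "dates": [{"confidence": 0.99, "page_id": 0, "value": "2018-09-25"}],
    "issuer_address": {
        "confidence": 0.49,
        "page_id": 0,
        "value": "156 University Ave, Toronto ON, Canada M5H 2H7",
    },
}

flagged = low_confidence_fields(prediction)
print([name for name, _, _ in flagged])  # ['recipient_address', 'issuer_address']
```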
For each document, the following fields are extracted.
Recipient Information
- recipient_name: In the JSON response, we have the value of the recipient name as found on the document.
{
"recipient_name": {
"confidence": 0.84,
"page_id": 0,
"polygon": [[0.035, 0.284], [0.098, 0.284], [0.098, 0.296], [0.035, 0.296]],
"value": "JIRO DOI"
}
}
- recipient_address: In the JSON response, we have the value of the recipient address as found on the document.
{
"recipient_address": {
"confidence": 0.3,
"page_id": 0,
"polygon": [[0.035, 0.304], [0.214, 0.304], [0.214, 0.353], [0.035, 0.353]],
"value": "1954 Bloon Street West Toronto, ON, M6P 3K9 Canada"
}
}
- recipient_company_registrations (List): In the JSON response below, we have the list of the recipient's company identifiers. Each item may contain:
- type (String Generic): The following company registration numbers are supported: VAT NUMBER, SIRET, SIREN, NIF, CF, UID, STNR, HRA/HRB, TIN (includes EIN, FEIN, SSN, ATIN, PTIN, ITIN), RFC, BTW, ABN, UEN, CVR, ORGNR, INN, DPH, GSTIN, COMPANY REGISTRATION NUMBER (UK), KVK, DIC
- value (String): Value of the company identifier
{
"recipient_company_registrations": [
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[ 0.515, 0.962 ], [ 0.59, 0.962 ], [ 0.59, 0.973 ], [ 0.515, 0.973 ]],
"type": "SIRET",
"value": "XXX81125600010"
},
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[ 0.658, 0.963 ], [ 0.729, 0.963 ], [ 0.729, 0.973 ], [ 0.658, 0.973 ]],
"type": "VAT NUMBER",
"value": "FR44837811XXX"
}
]
}
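Because company registrations come back as a list of typed items, it can be handy to group them by type before validating them. This Python sketch assumes the list has already been read from the prediction; the function name is hypothetical.

```python
from collections import defaultdict


def registrations_by_type(registrations: list) -> dict:
    """Group company registration values by their `type` attribute."""
    grouped = defaultdict(list)
    for item in registrations:
        grouped[item["type"]].append(item["value"])
    return dict(grouped)


# Values taken from the sample response above, without confidence and polygon.
recipient_company_registrations = [
    {"type": "SIRET", "value": "XXX81125600010"},
    {"type": "VAT NUMBER", "value": "FR44837811XXX"},
]

print(registrations_by_type(recipient_company_registrations))
# {'SIRET': ['XXX81125600010'], 'VAT NUMBER': ['FR44837811XXX']}
```

The same helper works for issuer_company_registrations, since both lists share the type and value attributes.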
Issuer Information
- issuer_company_registrations: List of the issuer's detected company registration numbers. Each company registration object contains the following attributes:
- type (String Generic): The following company registration numbers are supported: VAT NUMBER, SIRET, SIREN, NIF, CF, UID, STNR, HRA/HRB, TIN (includes EIN, FEIN, SSN, ATIN, PTIN, ITIN), RFC, BTW, ABN, UEN, CVR, ORGNR, INN, DPH, GSTIN, COMPANY REGISTRATION NUMBER (UK), KVK, DIC
- value (String): Value of the company identifier
{
"issuer_company_registrations": [
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.515, 0.962], [0.59, 0.962], [0.59, 0.973], [0.515, 0.973]],
"type": "SIRET",
"value": "XXX81125600010"
},
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.658, 0.963], [0.729, 0.963], [0.729, 0.973], [0.658, 0.973]],
"type": "VAT NUMBER",
"value": "FR44837811XXX"
}
]
}
- issuer_name: In the JSON response below, we have the value of the issuer name as written in the document.
{
"issuer_name": {
"confidence": 0.72,
"page_id": 0,
"polygon": [[0.164, 0.087], [0.4, 0.087], [0.4, 0.147], [0.164, 0.147]],
"value": "TURNPIKE DESIGNS CO."
}
}
- issuer_address: In the JSON response, we have the value of the issuer address as found on the document.
{
"issuer_address": {
"confidence": 0.49,
"page_id": 0,
"polygon": [[0.756, 0.128], [0.964, 0.128], [0.964, 0.162], [0.756, 0.162]],
"value": "156 University Ave, Toronto ON, Canada M5H 2H7"
}
}
Dates
- issuance_date: In the JSON response below, we have the value of the issuance date in ISO format (yyyy-mm-dd).
{
"issuance_date": {
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.84, 0.305], [0.932, 0.305], [0.932, 0.318], [0.84, 0.318]],
"value": "2018-09-25"
}
}
- dates: In the JSON response below, we have the list of all dates extracted from the document in ISO format (yyyy-mm-dd).
{
"dates": [
{
"confidence": 0.99,
"page_id": 0,
"polygon": [[0.842, 0.305], [0.931, 0.305], [0.931, 0.319], [0.842, 0.319]],
"value": "2018-09-25"
}
]
}
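Since dates are returned in ISO format, they can be parsed directly with Python's standard library. The sketch below checks whether a document was issued recently, a common requirement in onboarding flows; the function name and the 90-day limit are assumptions chosen for this example and are not part of the API.

```python
from datetime import date


def is_recent(issuance_date: dict, today: date, max_age_days: int = 90) -> bool:
    """True if the issuance date is at most `max_age_days` before `today`."""
    issued = date.fromisoformat(issuance_date["value"])
    age = (today - issued).days
    # A negative age means the date lies in the future, which is rejected too.
    return 0 <= age <= max_age_days


# issuance_date field from the sample response above, without the polygon.
field = {"confidence": 0.99, "page_id": 0, "value": "2018-09-25"}

print(is_recent(field, today=date(2018, 11, 1)))  # True (37 days old)
print(is_recent(field, today=date(2019, 3, 1)))   # False
```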
Locale
- locale: In the JSON response, we have the currency and language found on the document.
- language (String): Language code in ISO 639-1 format as seen on the document. The following language codes are supported: ca, de, en, es, fr, it, nl and pt.
- currency (String): Currency code in ISO 4217 format as seen on the document. The following currency codes are supported: USD, EUR, GBP, CAD, CHF, AED, AUD, BRL, CNY, COP, CZK, DKK, GNF, HKD, HUF, JPY, NOK, NZD, PLN, SEK, SGD, XPF.
{
"locale": {
"confidence": 0.94,
"currency": "CAD",
"language": "en"
}
}
Orientation
- orientation: The orientation field is only available at the page level, as it describes whether the page image should be rotated to be upright. The rotation value is also conveniently available in the JSON response at: document > inference > pages[ ] > orientation > value.
If the page requires rotation for correct display, the orientation field gives a prediction among these 3 possible outputs:
- 0 degrees: the page is already upright
- 90 degrees: the page must be rotated clockwise to be upright
- 270 degrees: the page must be rotated counterclockwise to be upright
In our example, the document doesn't require any rotation.
{
"orientation": {
"confidence": 0.99,
"degrees": 0
}
}
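To find out which pages need rotating before display, the per-page value at document > inference > pages[ ] > orientation > value can be collected as in the Python sketch below. The function name and the two-page sample are hypothetical.

```python
def pages_needing_rotation(response: dict) -> dict:
    """Map page id to orientation value for every page that is not upright."""
    pages = response["document"]["inference"]["pages"]
    return {
        page["id"]: page["orientation"]["value"]
        for page in pages
        if page["orientation"]["value"] != 0
    }


# Made-up two-page response: the first page is upright, the second is not.
sample = {
    "document": {
        "inference": {
            "pages": [
                {"id": 0, "orientation": {"value": 0}},
                {"id": 1, "orientation": {"value": 90}},
            ]
        }
    }
}

print(pages_needing_rotation(sample))  # {1: 90}
```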
All polygon fields across the JSON response are already rotated accordingly!
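To draw a polygon on the page image, its points need converting to pixels. The Python sketch below assumes that polygon coordinates are fractions of the page width and height with the origin at the top-left corner of the upright page. This matches the sample values, which all fall between 0 and 1, but it is an assumption rather than something stated above. The function name and image size are made up.

```python
def polygon_to_pixel_box(polygon: list, width: int, height: int) -> tuple:
    """Return the (left, top, right, bottom) pixel box enclosing a relative polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (
        round(min(xs) * width),
        round(min(ys) * height),
        round(max(xs) * width),
        round(max(ys) * height),
    )


# recipient_name polygon from the sample response, on a hypothetical 1000 x 2000 px image.
polygon = [[0.035, 0.284], [0.098, 0.284], [0.098, 0.296], [0.035, 0.296]]
print(polygon_to_pixel_box(polygon, width=1000, height=2000))  # (35, 568, 98, 592)
```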