Asynchronous Prediction

The Asynchronous Prediction endpoint is one of the two endpoints for extracting information from your batch of documents. Via HTTPS, you send your document (binary file, base64, or URL) and receive a job description in the response body as JSON. The prediction itself is delivered later to your webhook, or retrieved by polling.

🚧

The Asynchronous Prediction endpoint is not available for all document parsing APIs. Where it is unavailable, use the Synchronous Prediction endpoint instead. Check your API Documentation on Mindee's Platform.

1. Send my document

URL

📘

POST /predict_async

https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async

To make asynchronous predictions, make sure your document parsing API supports asynchronous mode, then fill in the URL placeholders:

  • <account> refers to the account name that owns the API:
    - For docTI APIs, this is your user name or organization name.
    - For Off-the-shelf APIs, use mindee as the account name.
  • <name>/<version> refers to the API name and preferred version as described in your API Documentation. For Off-the-shelf APIs, a new version may bring new features and better performance, but may not be fully backward compatible.
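The placeholder substitution described above can be sketched as a small helper. This is an illustrative sketch only: the function name and the sample values are hypothetical, and the helper simply percent-encodes each path segment before joining them.

```python
from urllib.parse import quote

BASE_URL = "https://api.mindee.net/v1/products"


def build_predict_async_url(account: str, name: str, version: str) -> str:
    """Assemble the predict_async URL from its three path placeholders."""
    # Percent-encode each segment so unexpected characters cannot alter the path.
    parts = [quote(part, safe="") for part in (account, name, version)]
    return f"{BASE_URL}/{'/'.join(parts)}/predict_async"


# Hypothetical values, for illustration only.
print(build_predict_async_url("my-account", "my_api", "1"))
# -> https://api.mindee.net/v1/products/my-account/my_api/1/predict_async
```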

Prepare payload

The endpoint accepts three types of payload for sending your document:

  • a binary file
  • a base64 encoded file
  • a URL

See Document inputs for more information on supported files.

Binary File

Use a multipart/form-data encoding to send your document:

import requests

url = "https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async"

with open("/path/to/my/file", "rb") as myfile:
    files = {"document": myfile}
    headers = {"Authorization": "Token <my-api-key-here>"}
    response = requests.post(url, files=files, headers=headers)
    print(response.text)

curl -X POST \
  https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async \
  -H 'Authorization: Token <my-api-key-here>' \
  -F document=@/path/to/your/file.png

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

class Program
{
    static void Main(string[] args)
    {
        var url = "https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async";
        var filePath = @"/path/to/my/file";
        var token = "my-api-key-here";

        var file = File.OpenRead(filePath);
        var streamContent = new StreamContent(file);
        var imageContent = new ByteArrayContent(streamContent.ReadAsByteArrayAsync().Result);
        imageContent.Headers.ContentType = MediaTypeHeaderValue.Parse("multipart/form-data");

        var form = new MultipartFormDataContent();
        form.Add(imageContent, "document", Path.GetFileName(filePath));

        var httpClient = new HttpClient();
        httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Token", token);
        var response = httpClient.PostAsync(url, form).Result;
        Console.WriteLine(response.Content.ReadAsStringAsync().Result);
    }
}

Base64 Encoded File

Prepare a JSON payload:

{
  "document": "/9j......"
}

Send your request with an application/json encoding:

curl -X POST \
  https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async \
  -H 'Authorization: Token <my-api-key-here>' \
  -H 'Content-Type: application/json' \
  -d '{"document": "/9j..."}'
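As a rough sketch of how the JSON payload could be produced in Python, the helper below base64-encodes raw file bytes and wraps them in the document key. The function name is hypothetical, and the four sample bytes merely stand in for a real file read from disk.

```python
import base64
import json


def build_base64_payload(file_bytes: bytes) -> str:
    """Return the JSON request body for a base64-encoded document."""
    encoded = base64.b64encode(file_bytes).decode("ascii")
    return json.dumps({"document": encoded})


# In practice the bytes would come from open("/path/to/my/file", "rb").read().
# These four bytes are the usual start of a JPEG file.
body = build_base64_payload(b"\xff\xd8\xff\xe0")
print(body)
# -> {"document": "/9j/4A=="}
```

The resulting string is what would be sent as the request body with the application/json content type.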

Public URL

Prepare a JSON payload with the URL included. Only valid public HTTPS links are accepted:

{
  "document": "https://mydomain.com/my_file.pdf"
}

Send your request with an application/json encoding:

curl -X POST \
  https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async \
  -H 'Authorization: Token <my-api-key-here>' \
  -H 'Content-Type: application/json' \
  -d '{"document":"https://mydomain.com/my_file.pdf"}'
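Because only HTTPS links are accepted, it can be convenient to reject other URLs before sending the request. The following is a client-side pre-check under that assumption; the function name is hypothetical, and it does not replace the validation performed by the API.

```python
import json
from urllib.parse import urlparse


def build_url_payload(document_url: str) -> str:
    """Return the JSON request body for a document referenced by URL."""
    parsed = urlparse(document_url)
    # Only an absolute https:// link with a host part is let through.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("document URL must be an absolute HTTPS link")
    return json.dumps({"document": document_url})


print(build_url_payload("https://mydomain.com/my_file.pdf"))
```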

JSON Response

See Endpoints for a general description of Mindee's REST API response format.

Here is an example JSON response from the asynchronous prediction endpoint, returned when your document passes all the validation checks performed by the Mindee API (see Document Inputs for more information):

{
    "api_request": {
        "error": {},
        "resources": ["job"],
        "status": "success",
        "status_code": 202,
        "url": "https://api.mindee.net/v1/products/<account>/<name>/<version>/predict_async"
    },
    "job": {
        "available_at": null,
        "id": "072c509b-ea8e-491e-99e0-795c0be8c59c",
        "issued_at": "2024-02-23T16:35:50.364723",
        "status": "waiting",
        "error": {}
    }
}

Otherwise, see Error Management.
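The job id in this response is what you need later for polling. A minimal sketch of extracting it is shown below; the function name is hypothetical, and treating any status code other than 202 as a failure is an assumption based on the example above.

```python
import json


def extract_job_id(response_body: str) -> str:
    """Return the job id from a predict_async response body."""
    data = json.loads(response_body)
    request_info = data["api_request"]
    # Assumption: 202 means the document was accepted and enqueued.
    if request_info["status_code"] != 202:
        raise RuntimeError(f"document was not enqueued: {request_info['error']}")
    return data["job"]["id"]


sample = json.dumps({
    "api_request": {"error": {}, "status": "success", "status_code": 202},
    "job": {"id": "072c509b-ea8e-491e-99e0-795c0be8c59c", "status": "waiting"},
})
print(extract_job_id(sample))
```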

2. Get my prediction with Webhooks

See Webhooks to learn how to set up your webhook.

Once your webhook is correctly set up, you will receive each prediction directly on your endpoint as a POST request with a JSON payload. Your endpoint must respond with a successful status code (2xx). Any complex or time-consuming logic running in your endpoint may cause a timeout.
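One way to avoid timeouts is to acknowledge the request immediately and defer the real work to a background worker. The sketch below uses only the Python standard library; the names, the port, and the choice of 204 and 400 as status codes are illustrative assumptions rather than requirements.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# Payloads waiting to be processed by a separate worker (not shown).
pending = queue.Queue()


def accept_webhook(raw_body: bytes) -> int:
    """Queue the payload and return the HTTP status code to send back."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return 400  # body was not valid JSON
    pending.put(payload)
    return 204  # any 2xx code acknowledges the delivery


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = accept_webhook(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling serve() starts the receiver; a worker thread or process would then consume items from pending at its own pace.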

JSON Response

Whether you receive the JSON payload on your webhook endpoint or retrieve it by polling, the parsed information from your documents can be found in the document key. The job key contains information about your asynchronous job.

{
    "document": {
        "id": "bb47bbab-8c97-4e83-a793-e401bdde3685",
        "inference": {
            "extras": {},
            "finished_at": "2024-02-23T16:35:54.909000",
            "is_rotation_applied": null,
            "prediction": { .. },
            "pages": [
              {
                "id": 0,
                "orientation": {"value": null},
                "extras": {},
                "prediction": { .. }
              },
              {
                "id": 1,
                "orientation": {"value": null},
                "extras": {},
                "prediction": { .. }
              }
            ],
            "processing_time": 2.623,
            "product": { .. },
            "started_at": "2024-02-23T16:35:50.364723"
        },
        "n_pages": 2,
        "name": "myfile.pdf"
    },
    "job": {
        "available_at": "2024-02-23T16:35:54.931372",
        "id": "072c509b-ea8e-491e-99e0-795c0be8c59c",
        "issued_at": "2024-02-23T16:35:50.364723",
        "status": "completed",
        "error": {}
    }
}

If the processing of your document has failed:

{
  "job": {
    "available_at": "2024-02-23T16:35:54.931372",
    "error": {
      "code": "ServerError",
      "details": "An error occurred",
      "message": "An error occurred"
    },
    "id": "072c509b-ea8e-491e-99e0-795c0be8c59c",
    "issued_at": "2024-02-23T16:35:50.364723",
    "status": "failed"
  }
}
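A receiver therefore has to tell the two payload shapes apart before reading the document key. A minimal sketch, with a hypothetical function name and assuming only the completed and failed statuses reach the webhook:

```python
def summarize_webhook_payload(payload: dict) -> str:
    """Return a one-line summary of a webhook payload."""
    job = payload["job"]
    if job["status"] == "failed":
        # Failed payloads carry no "document" key, only the job and its error.
        error = job.get("error") or {}
        return f"job {job['id']} failed: {error.get('code')} ({error.get('message')})"
    if job["status"] == "completed":
        document = payload["document"]
        return f"job {job['id']} completed: {document['name']}, {document['n_pages']} pages"
    raise ValueError(f"unexpected job status: {job['status']}")


failed = {
    "job": {
        "id": "072c509b-ea8e-491e-99e0-795c0be8c59c",
        "status": "failed",
        "error": {"code": "ServerError", "message": "An error occurred"},
    }
}
print(summarize_webhook_payload(failed))
```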

Document

Describes the uploaded document.

key | type | description
--- | --- | ---
id | string | a unique identifier
name | string | the filename
n_pages | number | the number of pages
inference | object | a JSON object with the content of your inference (prediction)

Document > Inference

Contains the whole inference data (predictions).

key | type | description
--- | --- | ---
started_at | string | the date & time the inference started, in ISO 8601 format
finished_at | string | the date & time the inference finished, in ISO 8601 format
processing_time | number | the request processing time in seconds
is_rotation_applied | boolean or null | true: polygons are already rotated given the page orientation; false: polygons are never rotated; null: the API has no orientation information
prediction | object | a JSON object with the document-level API prediction
pages | list[object] | a list of JSON objects with the page-level inference data

Document > Inference > Pages[ ]

Contains the page-level specific inference data (predictions).

key | type | description
--- | --- | ---
id | number | the page index
orientation.value | number | the clockwise rotation to apply to get the page upright. Examples: 0, 90, 180, 270
prediction | object | a JSON object with the page-level API prediction
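As an illustration of reading the page-level data, the sketch below lists the pages that report a non-zero rotation. The function name is hypothetical; it treats a null orientation value (as in the example payload above) the same as an upright page, which is an assumption.

```python
def pages_needing_rotation(inference: dict) -> list:
    """Return (page id, clockwise angle) pairs for pages that are not upright."""
    result = []
    for page in inference["pages"]:
        angle = page["orientation"]["value"]
        # Skips both null (no orientation information) and 0 (already upright).
        if angle:
            result.append((page["id"], angle))
    return result


inference = {
    "pages": [
        {"id": 0, "orientation": {"value": None}, "prediction": {}},
        {"id": 1, "orientation": {"value": 90}, "prediction": {}},
    ]
}
print(pages_needing_rotation(inference))
# -> [(1, 90)]
```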

Job

key | type | description
--- | --- | ---
id | string | a unique identifier
available_at | string | the date & time your predictions were available, in ISO 8601 format
issued_at | string | the date & time your document was enqueued, in ISO 8601 format
status | string | the status of your document inference. Can be one of: completed, failed, processing, waiting
error | object | a JSON object with error information

Prediction example

Each API can describe several fields within its prediction object.

{
  "prediction": {
    "total_amount": {
      "value": 16.50
    },
    "taxes": [
      {"value": 2.75, "rate": 20}
    ]
  }
}
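Reading such a prediction in Python could look like the sketch below. The field names total_amount and taxes come from the example above and differ from one API to another; the helper name is hypothetical.

```python
prediction = {
    "total_amount": {"value": 16.50},
    "taxes": [{"value": 2.75, "rate": 20}],
}


def total_taxes(prediction: dict) -> float:
    """Sum the value of every entry in the taxes list."""
    return sum(tax["value"] for tax in prediction.get("taxes", []))


total = prediction["total_amount"]["value"]
net = total - total_taxes(prediction)
print(total, total_taxes(prediction), net)
# -> 16.5 2.75 13.75
```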

Validate the HMAC Signature (Optional)

Mindee webhooks use an HMAC signature to ensure the integrity and authenticity of the transmitted data. Verifying this signature is optional but strongly recommended: it secures your webhook endpoint by ensuring that all incoming requests were generated by Mindee.

To find the generated signature, look for the X-Mindee-Hmac-Signature header. You will also need your own Signing Secret from Mindee's Platform.

In the following code examples, SIGNATURE is the value of the X-Mindee-Hmac-Signature header, SECRET is your Signing Secret, and PAYLOAD is the payload as received on your endpoint.

import hashlib
import hmac

PAYLOAD = ''
SECRET = ''
SIGNATURE = ''

mac = hmac.new(
    SECRET.encode("utf-8"),
    msg=PAYLOAD.encode("utf-8"),
    digestmod=hashlib.sha256,
).hexdigest()

is_valid = hmac.compare_digest(mac, SIGNATURE)
print(is_valid)
# should print True if the calculated hash equals the received signature, False otherwise.

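const crypto = require('crypto')

function try_verif() {
    const hash = crypto.createHmac(
        'sha256', SECRET
    ).update(PAYLOAD).digest('hex')

    console.log(hash === SIGNATURE)
}
// should log true if the calculated hash equals the received signature, false otherwise.

import java.nio.charset.StandardCharsets;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import org.apache.commons.codec.binary.Hex;

public String getHmacSignature(byte[] payload, String secretKey) {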
  String algorithm = "HmacSHA256";
  SecretKeySpec secretKeySpec = new SecretKeySpec(
    secretKey.getBytes(StandardCharsets.UTF_8),
    algorithm
  );
  Mac mac;
  try {
    mac = Mac.getInstance(algorithm);
  } catch (NoSuchAlgorithmException err) {
    // this should never happen as the algorithm is hard-coded.
    return "";
  }
  try {
    mac.init(secretKeySpec);
  } catch (InvalidKeyException err) {
    return "";
  }
  return Hex.encodeHexString(mac.doFinal(payload));
}

public boolean isValidHmacSignature(String payload, String secretKey, String signature) {
  return signature.equals(getHmacSignature(
      payload.getBytes(StandardCharsets.UTF_8),
      secretKey
  ));
}
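In a real receiver, the payload usually arrives as raw bytes, and re-encoding a parsed copy can change it and break the comparison. The following sketch wraps the same SHA-256 HMAC check in a function that works directly on the raw request body; the function name and the sample values are hypothetical.

```python
import hashlib
import hmac


def is_valid_signature(raw_body: bytes, secret: str, signature: str) -> bool:
    """Check the received signature against an HMAC of the raw body."""
    expected = hmac.new(
        secret.encode("utf-8"),
        msg=raw_body,
        digestmod=hashlib.sha256,
    ).hexdigest()
    # compare_digest runs in constant time, which avoids timing leaks.
    return hmac.compare_digest(expected, signature)


# Illustration with made-up values: sign a body, then verify it.
body = b'{"job": {"status": "completed"}}'
secret = "my-signing-secret"
signature = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
print(is_valid_signature(body, secret, signature))
```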

3. Get my prediction with polling

If, for any reason, you need to retrieve your document prediction manually, you can use our polling endpoint instead.

Retrieve your job status

📘

GET /documents/queue/<job_id>

https://api.mindee.net/v1/products/<account>/<name>/<version>/documents/queue/<job_id>

While your job is still in waiting or processing status, you will receive a simple JSON response with information on your enqueued job, with a 200 status code.

{
    "api_request": { .. },
    "job": {
        "available_at": null,
        "id": "072c509b-ea8e-491e-99e0-795c0be8c59c",
        "issued_at": "2024-02-23T16:35:50.364723",
        "status": "processing",
        "error": {}
    }
}
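A polling loop built on this response might look like the sketch below. Here fetch_job is a hypothetical callable that performs the GET request and returns the job object as a dictionary; the interval and the attempt limit are arbitrary values, not recommendations from the API.

```python
import time


def wait_for_job(fetch_job, interval_s=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the job leaves the waiting and processing states."""
    for _ in range(max_attempts):
        job = fetch_job()
        if job["status"] not in ("waiting", "processing"):
            return job  # completed or failed
        sleep(interval_s)
    raise TimeoutError("job did not finish within the allowed attempts")


# Illustration with a fake sequence of job states instead of real requests.
states = iter([
    {"status": "waiting"},
    {"status": "processing"},
    {"status": "completed"},
])
slept = []
final_job = wait_for_job(lambda: next(states), interval_s=0.5, sleep=slept.append)
print(final_job["status"], len(slept))
# -> completed 2
```

Passing sleep as a parameter keeps the loop easy to exercise without actually waiting.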

Retrieve your document predictions

Once your job is completed, the endpoint returns a 302 Found redirect status code, redirecting you to https://api.mindee.net/v1/products/<account>/<name>/<version>/documents/<document_id>, with the final document_id provided.

The JSON response of this endpoint is exactly the same as what you would have received on your endpoint with a webhook enabled.
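With Python's standard library, the redirect does not need special handling, since urllib.request.urlopen follows a 302 by default. The sketch below builds the queue request and pulls the prediction out of the final body; all function names are hypothetical, and the assumption is that a body without a document key means the job is still in progress.

```python
import json
import urllib.request


def build_queue_request(account, name, version, job_id, api_key):
    """Build the GET request for the job queue endpoint."""
    url = (
        "https://api.mindee.net/v1/products/"
        f"{account}/{name}/{version}/documents/queue/{job_id}"
    )
    return urllib.request.Request(url, headers={"Authorization": f"Token {api_key}"})


def fetch_json(request):
    """Send the request, following any redirect, and decode the JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_prediction(body: dict):
    """Return the document-level prediction, or None while the job is pending."""
    document = body.get("document")
    return document["inference"]["prediction"] if document else None
```

A typical call chain would be extract_prediction(fetch_json(build_queue_request(...))), repeated until the result is no longer None.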

👍

Success

To learn more about your document parsing API response, especially the structure of the prediction object, see the Documentation section of your API on Mindee's Platform.
