Receipts API

The Node.js SDK supports the receipt API for extracting data from receipts.

const { Client } = require("mindee");

// Create the client with your receipt API token.
const mindeeClient = new Client({
  receiptToken: "yourReceiptToken"
});

// Parse a receipt from a local file path.
mindeeClient.receipt.parse({
  input: "receipt.jpg",
  inputType: "path",
  filename: undefined,
  cutPdf: true,
  includeWords: false
})
  .then((res) => {
    // Document-level receipt object.
    console.log(res.receipt);
  })
  .catch((err) => {
    console.error(err);
  });

Info: You can also provide the token through an environment variable.
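
If you prefer to read the token yourself rather than rely on the SDK's built-in lookup, a minimal sketch could look like the following. The variable name `RECEIPT_API_TOKEN` and the helper `clientOptionsFromEnv` are hypothetical and chosen only for illustration; the name the SDK reads natively is not stated on this page.

```javascript
// Hypothetical variable name, used only for this sketch.
const TOKEN_ENV_VAR = "RECEIPT_API_TOKEN";

// Build the options object for `new Client(...)` from an environment-like object.
function clientOptionsFromEnv(env = process.env) {
  const token = env[TOKEN_ENV_VAR];
  if (!token) {
    throw new Error(`Missing environment variable ${TOKEN_ENV_VAR}`);
  }
  return { receiptToken: token };
}

// Usage with the SDK: const mindeeClient = new Client(clientOptionsFromEnv());
console.log(clientOptionsFromEnv({ RECEIPT_API_TOKEN: "yourReceiptToken" }));
```

Failing early on a missing token keeps the error close to its cause instead of surfacing later as a rejected API call.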

Client Receipt Parse Parameters

| Parameter name | Description | Default value |
| --- | --- | --- |
| input | Document to parse, in the form declared by inputType |  |
| inputType | path: File path; stream: From file object; base64: From a base64 encoded file | path |
| cutPdf | (Boolean) If set to true and the PDF has more than 5 pages, the library creates a new PDF by concatenating the first 4 pages and the last page. | true |
| includeWords | (Boolean) If set to true, the raw HTTP response includes all the words in the document, associated with their positions. | false |
| filename | (String) Specify a filename for your input. | undefined |
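
The defaults and the cutPdf rule listed above can be sketched as follows. Both helpers (`withParseDefaults`, `pagesKeptByCutPdf`) are hypothetical and only illustrate the documented behavior; they are not part of the SDK.

```javascript
// Defaults as listed in the parameter descriptions.
const PARSE_DEFAULTS = {
  inputType: "path",
  filename: undefined,
  cutPdf: true,
  includeWords: false
};

// Hypothetical helper: fill in the default for any option the caller omits.
function withParseDefaults(params) {
  return { ...PARSE_DEFAULTS, ...params };
}

// Hypothetical helper: zero-based indices of the pages kept by the cutPdf rule
// (more than 5 pages: first 4 pages plus the last page).
function pagesKeptByCutPdf(pageCount) {
  if (pageCount <= 5) {
    return Array.from({ length: pageCount }, (_, i) => i);
  }
  return [0, 1, 2, 3, pageCount - 1];
}

console.log(withParseDefaults({ input: "receipt.pdf" }));
console.log(pagesKeptByCutPdf(3));  // [ 0, 1, 2 ]
console.log(pagesKeptByCutPdf(12)); // [ 0, 1, 2, 3, 11 ]
```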

The examples that follow use a sample receipt to illustrate how to extract data with the SDK.
[Image: sample receipt]

Receipt Data structure

To access a receipt object, you need to create a mindee.Client and call the Client.receipt.parse method.

The receipt object JSON data structure consists of:

Document level prediction

For document-level prediction, the SDK builds a single receipt object by combining the predictions from every page of the document. The combination relies on field confidence scores.

The SDK iterates over the pages and, for each field, keeps the prediction with the highest probability.

For example, if you send a three-page receipt, the document level gives you one tax, one total, and so on.
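
To make the selection rule concrete, here is a minimal sketch of "keep the highest-probability prediction per field". The helper `mergePages` and the page data are made up for illustration; the SDK's own merging code may differ, for example in how it treats arrays such as taxes.

```javascript
// Hypothetical helper: for each named field, keep the page-level prediction
// with the highest probability. Handles single-value fields only.
function mergePages(pages, fieldNames) {
  const merged = {};
  for (const name of fieldNames) {
    for (const page of pages) {
      const candidate = page[name];
      if (!candidate) continue;
      if (!merged[name] || candidate.probability > merged[name].probability) {
        merged[name] = candidate;
      }
    }
  }
  return merged;
}

// Made-up page predictions, shaped like the fields shown on this page.
const pages = [
  { totalIncl: { pageNumber: 0, value: 7.27, probability: 0.62 } },
  {
    totalIncl: { pageNumber: 1, value: 17.27, probability: 0.99 },
    date: { pageNumber: 1, value: "2022-04-03", probability: 0.9 }
  }
];

console.log(mergePages(pages, ["totalIncl", "date"]));
```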

res.receipt

Code Example

mindeeClient.receipt.parse({
    input : "receipt.jpg",
    inputType : 'path',
    filename : undefined,
    cutPdf : true,
    includeWords : false
})
.then((res) => {
  console.log(res.receipt);
})
.catch((err) => {
  console.error(err);
});

Output

receipt {
  filepath: '/var/folders/cf/06c5ln2s71jdwhb2r1cb82yh0000gn/T/upload_0db18418a792cafea3eada04dda37140receipt23.png',
  filename: 'upload_0db18418a792cafea3eada04dda37140receipt23.png',
  fileExtension: 'image/png',
  checklist: { taxesMatchTotalIncl: false },
  words: [],
  locale: {
    pageNumber: 0,
    reconstructed: false,
    value: 'en-US',
    probability: 0.89,
    bbox: [],
    language: 'en',
    country: 'US',
    currency: 'USD'
  },
  totalIncl: {
    pageNumber: 0,
    reconstructed: false,
    value: 7.27,
    probability: 0.99,
    bbox: [ [Array], [Array], [Array], [Array] ]
  },
  date: {
    pageNumber: 0,
    reconstructed: false,
    value: '2022-04-03',
    probability: 0.99,
    bbox: [ [Array], [Array], [Array], [Array] ],
    dateObject: '2022-04-03T00:00:00.000Z'
  },
  category: {
    pageNumber: 0,
    reconstructed: false,
    value: 'food',
    probability: 0.9,
    bbox: []
  },
  merchantName: {
    pageNumber: 0,
    reconstructed: false,
    value: 'MINDEE TAKE OUT',
    probability: 0.7,
    bbox: [ [Array], [Array], [Array], [Array] ]
  },
  time: {
    pageNumber: 0,
    reconstructed: false,
    value: '10:00',
    probability: 0.99,
    bbox: [ [Array], [Array], [Array], [Array] ]
  },
  taxes: [
    {
      pageNumber: 0,
      reconstructed: false,
      value: 0.41,
      probability: 0.98,
      bbox: [Array],
      code: 'TAX'
    }
  ],
  orientation: {
    pageNumber: 0,
    reconstructed: false,
    value: 0,
    probability: 0.99,
    bbox: []
  },
  totalTax: {
    pageNumber: 0,
    reconstructed: true,
    value: 0.41,
    probability: 0.98,
    bbox: []
  },
  totalExcl: {
    pageNumber: 0,
    reconstructed: true,
    value: 6.86,
    probability: 0.9702,
    bbox: []
  }
}

Page level prediction

For page-level prediction, the SDK builds one receipt object per page. Each page of the PDF is treated independently.

For example, if you send a three-page receipt, page-level prediction gives you three taxes, three totals, and so on.
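
As a rough illustration of working with per-page results, the sketch below walks an array shaped like res.receipts and collects one summary per page. The helper `summarizePages` and the sample values are hypothetical.

```javascript
// Hypothetical helper: one summary entry per page-level receipt.
function summarizePages(receipts) {
  return receipts.map((receipt) => ({
    page: receipt.totalIncl.pageNumber,
    totalIncl: receipt.totalIncl.value,
    totalTax: receipt.totalTax.value
  }));
}

// Made-up stand-in for res.receipts with two pages.
const pageReceipts = [
  { totalIncl: { pageNumber: 0, value: 7.27 }, totalTax: { pageNumber: 0, value: 0.41 } },
  { totalIncl: { pageNumber: 1, value: 10.2 }, totalTax: { pageNumber: 1, value: 1.7 } }
];

console.log(summarizePages(pageReceipts));
```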

res.receipts

Code Example

mindeeClient.receipt.parse({
    input : "receipt.jpg",
    inputType : 'path',
    filename : undefined,
    cutPdf : true,
    includeWords : false
})
.then((res) => {
  console.log(res.receipts);
})
.catch((err) => {
  console.error(err);
});

Output

receipts [
  Receipt {
    filepath: '/var/folders/cf/06c5ln2s71jdwhb2r1cb82yh0000gn/T/upload_af2509b9d84a2f887b4dfcd7b3622c0freceipt23.png',
    filename: 'upload_af2509b9d84a2f887b4dfcd7b3622c0freceipt23.png',
    fileExtension: 'image/png',
    checklist: { taxesMatchTotalIncl: false },
    words: [],
    locale: Locale {
      pageNumber: 0,
      reconstructed: false,
      value: 'en-US',
      probability: 0.89,
      bbox: [],
      language: 'en',
      country: 'US',
      currency: 'USD'
    },
    totalIncl: Amount {
      pageNumber: 0,
      reconstructed: false,
      value: 7.27,
      probability: 0.99,
      bbox: [Array]
    },
    date: DateField {
      pageNumber: 0,
      reconstructed: false,
      value: '2022-04-03',
      probability: 0.99,
      bbox: [Array],
      dateObject: 2022-04-03T00:00:00.000Z
    },
    category: Field {
      pageNumber: 0,
      reconstructed: false,
      value: 'food',
      probability: 0.9,
      bbox: []
    },
    merchantName: Field {
      pageNumber: 0,
      reconstructed: false,
      value: 'MINDEE TAKE OUT',
      probability: 0.7,
      bbox: [Array]
    },
    time: Field {
      pageNumber: 0,
      reconstructed: false,
      value: '10:00',
      probability: 0.99,
      bbox: [Array]
    },
    taxes: [ [Tax] ],
    orientation: Orientation {
      pageNumber: 0,
      reconstructed: false,
      value: 0,
      probability: 0.99,
      bbox: []
    },
    totalTax: Amount {
      pageNumber: 0,
      reconstructed: true,
      value: 0.41,
      probability: 0.98,
      bbox: []
    },
    totalExcl: Amount {
      pageNumber: 0,
      reconstructed: true,
      value: 6.86,
      probability: 0.9702,
      bbox: []
    }
  }
]

Raw HTTP response

Get the full API response as a Node.js HTTP Response object.

mindeeClient.receipt.parse({
    input : "receipt.jpg",
    inputType : 'path',
    filename : undefined,
    cutPdf : true,
    includeWords : false
})
.then((res) => {
  console.log(res.receipt.httpResponse);
})
.catch((err) => {
  console.error(err);
});

Extracted Fields

Each receipt object contains a set of fields. Each field has the following four attributes:

  • value (String or Float, depending on the field type): the field value. Left unset if the field was not extracted.
  • probability (Float): the confidence score of the field prediction.
  • bbox (Array of [x, y] Float pairs): the relative coordinates of the vertices of the bounding box that contains the field in the image. If the field is not written on the receipt, bbox is an empty array.
  • reconstructed (Boolean): true if the field was reconstructed from other fields.

Depending on the field type, there can be extra attributes.
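
The common attributes can be used as sketched below: one helper converts a relative bbox to pixel coordinates, another returns a value only above a confidence threshold. Both helpers are hypothetical, and the sketch assumes that x is relative to the image width and y to the image height.

```javascript
// Hypothetical helper: scale relative [x, y] vertices to pixel coordinates.
// Assumes x is a fraction of the width and y a fraction of the height.
function bboxToPixels(bbox, imageWidth, imageHeight) {
  return bbox.map(([x, y]) => [
    Math.round(x * imageWidth),
    Math.round(y * imageHeight)
  ]);
}

// Hypothetical helper: return the value only when the confidence is high enough.
function valueIfConfident(field, minProbability) {
  return field.probability >= minProbability ? field.value : undefined;
}

// Field shaped like the sample outputs on this page.
const sampleField = {
  value: 10.2,
  probability: 1,
  reconstructed: false,
  bbox: [[0.549, 0.619], [0.715, 0.619], [0.715, 0.64], [0.549, 0.64]]
};

console.log(bboxToPixels(sampleField.bbox, 1000, 2000));
console.log(valueIfConfident(sampleField, 0.9));
```

A reconstructed field has an empty bbox, so `bboxToPixels` simply returns an empty array for it.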

Total amounts

receipt.totalIncl: Total amount including taxes
receipt.totalExcl: Total amount excluding taxes
receipt.totalTax: Total tax value reconstructed from tax lines

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.totalIncl);
    console.log(res.receipt.totalExcl);
    console.log(res.receipt.totalTax);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: 10.2,
  probability: 1,
  bbox: [
    [ 0.549, 0.619 ],
    [ 0.715, 0.619 ],
    [ 0.715, 0.64 ],
    [ 0.549, 0.64 ]
  ]
}
{
  pageNumber: 0,
  reconstructed: true,
  value: 8.5,
  probability: 1,
  bbox: []
}
{
  pageNumber: 0,
  reconstructed: true,
  value: 1.7,
  probability: 1,
  bbox: []
}
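
Since totalIncl includes taxes and totalExcl excludes them, the three amounts can be cross-checked. The sketch below is a hypothetical helper, not an SDK function; the tolerance of 0.01 is an assumption to absorb floating-point and rounding differences.

```javascript
// Hypothetical helper: true when totalExcl + totalTax matches totalIncl
// within the given tolerance.
function totalsAreConsistent(receipt, tolerance = 0.01) {
  const { totalIncl, totalExcl, totalTax } = receipt;
  const difference = totalExcl.value + totalTax.value - totalIncl.value;
  return Math.abs(difference) <= tolerance;
}

// Values taken from the sample output for the three total fields.
const sampleTotals = {
  totalIncl: { value: 10.2, reconstructed: false },
  totalExcl: { value: 8.5, reconstructed: true },
  totalTax: { value: 1.7, reconstructed: true }
};

console.log(totalsAreConsistent(sampleTotals));
```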

Taxes

receipt.taxes: Array of Tax fields

Each tax field has two extra attributes:

rate: (Float), optional tax rate.
code: (String), optional tax code (for example HST or GST in Canada, City Tax or State Tax in the US).

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.taxes);
  })
  .catch((err) => {
    console.error(err);
  });

Output

[
  {
    pageNumber: 0,
    reconstructed: false,
    value: 1.7,
    probability: 1,
    bbox: [ [Array], [Array], [Array], [Array] ],
    rate: 20
  }
]
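
receipt.totalTax is described as reconstructed from the tax lines. A minimal sketch of such a sum is shown below; the helper `sumTaxes` is hypothetical, and rounding to two decimals is an assumption rather than documented SDK behavior.

```javascript
// Hypothetical helper: add up the value of every tax line,
// rounded to two decimals (an assumption for this sketch).
function sumTaxes(taxes) {
  const total = taxes.reduce((sum, tax) => sum + (tax.value || 0), 0);
  return Math.round(total * 100) / 100;
}

// Made-up tax lines shaped like the sample outputs on this page.
const sampleTaxes = [
  { value: 1.7, rate: 20 },
  { value: 0.41, code: "TAX" }
];

console.log(sumTaxes(sampleTaxes)); // 2.11
```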

Dates

receipt.date: Payment date of the receipt

Each date field comes with an extra attribute:

dateObject: (Datetime), the date as a datetime object

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.date);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: '2016-02-26',
  probability: 0.99,
  bbox: [
    [ 0.479, 0.173 ],
    [ 0.613, 0.173 ],
    [ 0.613, 0.197 ],
    [ 0.479, 0.197 ]
  ],
  dateObject: '2016-02-26T00:00:00.000Z'
}
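
If you only have the string value of a date field, you can build a JavaScript Date from it. The helper `toDate` below is a hypothetical sketch; it relies on the standard behavior that date-only ISO strings are parsed as UTC midnight.

```javascript
// Hypothetical helper: build a Date from a date field's value.
// Returns undefined when the value is missing or not a valid date.
function toDate(dateField) {
  if (!dateField || !dateField.value) return undefined;
  const parsed = new Date(dateField.value); // date-only ISO strings are parsed as UTC
  return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}

const parsedDate = toDate({ value: "2016-02-26" });
console.log(parsedDate.toISOString()); // 2016-02-26T00:00:00.000Z
```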

Merchant name

receipt.merchantName: Supplier name as written in the receipt (logo)

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.merchantName);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: 'CLACHAN',
  probability: 0.71,
  bbox: [
    [ 0.394, 0.068 ],
    [ 0.477, 0.068 ],
    [ 0.477, 0.087 ],
    [ 0.394, 0.087 ]
  ]
}

Locale and currency

receipt.locale: Locale code combining language and country (for example en-GB)

Contains three extra attributes:

language: (String), two-letter language code
country: (String), two-letter country code
currency: (String), ISO currency code

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.locale);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: 'en-GB',
  probability: 0.82,
  bbox: [],
  language: 'en',
  country: 'GB',
  currency: 'GBP'
}
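
One possible use of the locale field is to format amounts for display. The sketch below uses the standard `Intl.NumberFormat` API; the helper `formatAmount` is hypothetical, and the exact output depends on the ICU data available in your Node.js build.

```javascript
// Hypothetical helper: format an amount field with the receipt's locale and currency.
function formatAmount(amountField, localeField) {
  return new Intl.NumberFormat(localeField.value, {
    style: "currency",
    currency: localeField.currency
  }).format(amountField.value);
}

// Values taken from the sample outputs on this page.
const sampleLocale = { value: "en-GB", language: "en", country: "GB", currency: "GBP" };
console.log(formatAmount({ value: 10.2 }, sampleLocale));
```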

Category

receipt.category: Receipt category among the list: toll, food, parking, transport, accommodation, gasoline, miscellaneous

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.category);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: 'food',
  probability: 0.99,
  bbox: []
}
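
The category value lends itself to grouping expenses. The sketch below sums totalIncl per category; the helper `totalsByCategory` is hypothetical, and counting values outside the documented list under miscellaneous is an assumption made for this example.

```javascript
// Categories as listed in the field description.
const RECEIPT_CATEGORIES = [
  "toll", "food", "parking", "transport",
  "accommodation", "gasoline", "miscellaneous"
];

// Hypothetical helper: sum totalIncl per category. Values outside the
// documented list are counted under "miscellaneous" (an assumption).
function totalsByCategory(receipts) {
  const totals = {};
  for (const receipt of receipts) {
    const raw = receipt.category.value;
    const category = RECEIPT_CATEGORIES.includes(raw) ? raw : "miscellaneous";
    totals[category] = (totals[category] || 0) + receipt.totalIncl.value;
  }
  return totals;
}

console.log(totalsByCategory([
  { category: { value: "food" }, totalIncl: { value: 7.27 } },
  { category: { value: "food" }, totalIncl: { value: 10.2 } },
  { category: { value: "parking" }, totalIncl: { value: 3 } }
]));
```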

Time

receipt.time: Time of purchase in 24-hour format (hh:mm)

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.time);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: '15:20',
  probability: 0.99,
  bbox: [
    [ 0.62, 0.173 ],
    [ 0.681, 0.173 ],
    [ 0.681, 0.191 ],
    [ 0.62, 0.191 ]
  ]
}
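
The date and time fields can be combined into a single timestamp. The helper `purchaseTimestamp` below is a hypothetical sketch; the receipt does not state a time zone, so treating the values as UTC is an assumption you may need to adapt.

```javascript
// Hypothetical helper: combine a date field (YYYY-MM-DD) and a time field (hh:mm)
// into one Date. Treats both values as UTC, which is an assumption.
function purchaseTimestamp(dateField, timeField) {
  if (!dateField.value || !timeField.value) return undefined;
  const combined = new Date(`${dateField.value}T${timeField.value}:00.000Z`);
  return Number.isNaN(combined.getTime()) ? undefined : combined;
}

// Values taken from the sample outputs on this page.
const stamp = purchaseTimestamp({ value: "2016-02-26" }, { value: "15:20" });
console.log(stamp.toISOString()); // 2016-02-26T15:20:00.000Z
```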

Orientation

receipt.orientation: Rotation (in degrees) of receipt.

mindeeClient.receipt.parse(
    {
        input : "receipt.jpg",
        inputType : 'path',
    }
)
  .then((res) => {
    console.log(res.receipt.orientation);
  })
  .catch((err) => {
    console.error(err);
  });

Output

{
  pageNumber: 0,
  reconstructed: false,
  value: 0,
  probability: 0.99,
  bbox: []
}

Checklist

receipt.checklist.taxesMatchTotalIncl: (Boolean) true if (tax rate) * totalExcl + totalExcl = totalIncl
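
The check can be approximated as in the sketch below. This is a hypothetical re-implementation for illustration, not the SDK's own code. It assumes that rate is a percentage (20 means 20 %), that every tax line applies to the whole totalExcl, and that a tolerance of 0.01 is acceptable.

```javascript
// Hypothetical re-implementation of the check, NOT the SDK's own code.
// Assumes tax.rate is a percentage and each tax line applies to totalExcl.
function taxesMatchTotalIncl(receipt, tolerance = 0.01) {
  const rates = receipt.taxes.map((tax) => tax.rate);
  if (rates.length === 0 || rates.some((rate) => typeof rate !== "number")) {
    return false; // cannot verify without a rate on every tax line
  }
  const totalRate = rates.reduce((sum, rate) => sum + rate, 0) / 100;
  const expected = totalRate * receipt.totalExcl.value + receipt.totalExcl.value;
  return Math.abs(expected - receipt.totalIncl.value) <= tolerance;
}

// Values taken from the sample outputs for totals and taxes.
const checkedReceipt = {
  taxes: [{ value: 1.7, rate: 20 }],
  totalExcl: { value: 8.5 },
  totalIncl: { value: 10.2 }
};

console.log(taxesMatchTotalIncl(checkedReceipt));
```

In this sketch, a tax line without a rate makes the check return false, because the formula cannot be evaluated.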

