French health insurance card OCR

This article explains how to extract all the relevant data from French health insurance cards (carte mutuelle tiers payant) using our deep learning OCR engine.

Prerequisites

  1. You’ll need a free account. Sign up and confirm your email to log in.
  2. You’ll need at least 20 health insurance card images or PDFs to train your OCR.
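
Before starting, it can help to confirm that your sample folder meets the 20-file minimum. The following is a minimal sketch, not part of the platform: the function names and the list of accepted file extensions are assumptions made for illustration, so adjust them to the formats you actually have.

```python
from pathlib import Path

# Assumed file extensions (hypothetical list); adjust to your own samples.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}
MIN_TRAINING_FILES = 20  # minimum stated in the prerequisites


def count_training_files(folder):
    """Count files directly inside `folder` that look like images or PDFs."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def has_enough_training_files(folder):
    """Return True when the folder holds at least MIN_TRAINING_FILES samples."""
    return count_training_files(folder) >= MIN_TRAINING_FILES
```

Calling `has_enough_training_files("path/to/cards")` on your own folder returns `True` once you have collected enough samples.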

Define the data you need from health insurance cards

First, we need to set up our use case by defining which fields we want to extract from health insurance cards.

French health insurance key data extraction

In this article, we are going to set up an API for extracting the following fields:

  • AMC Number: The AMC number of the cardholder
  • Member identification number: The unique member identification number (numéro adhérent)
  • Teletransmission number
  • Insured full name: Full name of the first insured person. You can add more fields of this type if you want to extract the names of additional insured persons.
  • Insured social security number: Social security number (SSN) of the first insured person.
  • Validity start date
  • Validity end date

You can add as many fields as you want to fit your requirements.
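
To make the field list concrete, here is a minimal sketch of how an application might hold the extracted values once it has them. The `HealthInsuranceCard` class and its `is_valid_on` method are hypothetical names chosen for illustration, and the choice of `str` and `date` types is an assumption, not something the API prescribes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HealthInsuranceCard:
    """Hypothetical container for the seven fields listed above."""

    amc_number: Optional[str] = None
    member_identification_number: Optional[str] = None
    teletransmission_number: Optional[str] = None
    insured_full_name: Optional[str] = None
    insured_social_security_number: Optional[str] = None
    validity_start_date: Optional[date] = None
    validity_end_date: Optional[date] = None

    def is_valid_on(self, day):
        """Return True if both validity dates are known and `day` falls within them."""
        if self.validity_start_date is None or self.validity_end_date is None:
            return False
        return self.validity_start_date <= day <= self.validity_end_date
```

Every field defaults to `None` because any given card may be missing a value or the extraction may not find it.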

Deploy your Health Insurance card API

Now that the field list is defined, we can set up our API. Head over to the platform and press the ‘Create a new API’ button.

You now land on the setup page. Here is the image you can use to set up the API. For instance, our setup is as follows:

Set up your API

Click on the “next” button. You land on a new page where we are going to add the technical definitions of the fields we defined above.

Define your model

To move forward, you have two possibilities:

Upload a JSON config
Copy the following JSON into a file and upload it through the interface.

{
  "problem_type": {
    "classificator": { "features": [], "features_name": [] },
    "selector": {
      "features": [
        {
          "cfg": { "filter": { "alpha": 0, "numeric": -1 } },
          "handwritten": false,
          "name": "amc_number",
          "public_name": "AMC Number",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": 0, "numeric": -1 } },
          "handwritten": false,
          "name": "member_identification_number",
          "public_name": "Member identification number",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": 0, "numeric": -1 } },
          "handwritten": false,
          "name": "teletransmission_number",
          "public_name": "Teletransmission number",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "insured_full_name",
          "public_name": "Insured full name",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "insured_social_security_number",
          "public_name": "Insured social security number",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "convention": "FR" } },
          "handwritten": false,
          "name": "validity_start_date",
          "public_name": "Validity start date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "convention": "FR" } },
          "handwritten": false,
          "name": "validity_end_date",
          "public_name": "Validity end date",
          "semantics": "date"
        }
      ],
      "features_name": [
        "amc_number",
        "member_identification_number",
        "teletransmission_number",
        "insured_full_name",
        "insured_social_security_number",
        "validity_start_date",
        "validity_end_date"
      ]
    }
  }
}
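
If you edit this JSON by hand, a quick local check can catch mismatches before you upload it. The sketch below is an illustration only: `check_config` and `check_config_file` are hypothetical helpers, and the rule that `features_name` must list the same names in the same order as `features` is an assumption inferred from the example above, not a documented requirement.

```python
import json


def check_config(config):
    """Return a list of problems found in a data model config dict."""
    problems = []
    selector = config["problem_type"]["selector"]
    names = [feature["name"] for feature in selector["features"]]

    # Assumption: features_name mirrors the names in features, in order.
    if names != selector["features_name"]:
        problems.append("features_name does not match the names in features")

    # Duplicate names would make two fields indistinguishable.
    if len(set(names)) != len(names):
        problems.append("duplicate field names in features")

    return problems


def check_config_file(path):
    """Load a JSON config file and run the same checks on it."""
    with open(path, encoding="utf-8") as handle:
        return check_config(json.load(handle))
```

An empty list means no problem was found by these two checks; it does not guarantee that the platform will accept the file.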

Or build your data model manually
Using the interface, add each field to your data model.

In our example, here are the different field configurations we used:

  • AMC Number: type String with no alpha characters
  • Member identification number: type String with no alpha characters
  • Teletransmission number: type String with no alpha characters
  • Insured full name: type String with no numeric characters
  • Insured social security number: type String without any specification
  • Validity start date: type Date
  • Validity end date: type Date
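
These interface settings line up with the `cfg` entries in the JSON config. The sketch below shows one way to generate those entries in code. The helper names `string_field` and `date_field` are hypothetical, and the reading of the filter values (`0` meaning the character class is excluded, `-1` meaning it is unrestricted) is inferred from comparing the JSON example with the list above, so treat it as an assumption.

```python
def string_field(name, public_name, allow_alpha=True, allow_numeric=True):
    """Build a String field entry; 0 is assumed to exclude a character class."""
    return {
        "cfg": {
            "filter": {
                "alpha": -1 if allow_alpha else 0,
                "numeric": -1 if allow_numeric else 0,
            }
        },
        "handwritten": False,
        "name": name,
        "public_name": public_name,
        "semantics": "word",
    }


def date_field(name, public_name, convention="FR"):
    """Build a Date field entry using the given date convention."""
    return {
        "cfg": {"filter": {"convention": convention}},
        "handwritten": False,
        "name": name,
        "public_name": public_name,
        "semantics": "date",
    }


# Example: the first and last fields of the data model above.
amc_number = string_field("amc_number", "AMC Number", allow_alpha=False)
validity_end_date = date_field("validity_end_date", "Validity end date")
```

Generating the entries this way keeps the seven fields consistent if you later add more insured persons to the model.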

That's it for the setup phase. Let's deploy your OCR.

Ready to train model

Train your health insurance card model

Train your model

Your setup is done. You can now deploy your API and train your deep learning OCR to extract data from your health insurance cards using the API.

For more information about the training phase, please refer to the Getting Started tutorial. If you have any questions regarding your use case, feel free to reach out on our chat!

Updated 28 days ago

