Commercial General Liability Section OCR

This article describes how to build an OCR API that extracts data from the Commercial General Liability Section using our deep learning engine. If you want to automate your workflow, this article is for you.

Prerequisites

  1. You’ll need a free account. Sign up and confirm your email to log in.
  2. You’ll need at least 20 Commercial General Liability Section images or PDFs to train your OCR.
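
Before you start, it can help to confirm that you have enough training samples. Here is a minimal sketch, assuming your documents sit in a single local folder. The function names and the list of extensions are illustrative assumptions, not platform requirements:

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever file types you actually use.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}
MIN_TRAINING_DOCS = 20  # minimum stated in the prerequisites


def count_training_docs(folder):
    """Count files in `folder` whose extension looks like an image or PDF."""
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def has_enough_docs(folder, minimum=MIN_TRAINING_DOCS):
    """Return True when the folder holds at least `minimum` usable documents."""
    return count_training_docs(folder) >= minimum
```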

Define your Commercial General Liability Section use case

First, we’re going to define which fields we want to extract from your **Commercial General Liability Section**.

Commercial Liability Section key data extraction

General Information

Agency
Carrier
Agency Customer ID
Policy Number
Effective Date
Applicant
NAIC Code
Date

Coverages

Property Damage Deductibles
Bodily Injury Deductibles per claim
Bodily Injury Deductibles per occurrence

Limits

General Aggregate Limit
Products & Completed Operations Aggregate
Personal & Advertising Injury
Each Occurrence
Damage to Rented Premises (each occurrence)
Medical Expense (any other person)
Employee Benefit

Premium

Premises Operations Premium
Products Premium
Other Premium
Total Premium

Schedule of Hazards

Loc #
Haz #
Classification
Class Code
Premium basis
Exposure
Terr
Premops Rate

Claims Made

Proposed Retroactive Date
Entry Date

Employee Benefits Liability

Deductible per Claim
Number of Employees
Number of Employees covered
Retroactive Date

Contractors

Paid to Subcontractor
% of work subcontracted
Number of Full Time Staff
Number of Part-Time Staff

Products / Completed Operations

Product
Annual Gross Sales
Number of Units
Time to Market
Expected Life
Intended Use
Principal Component

Additional Interest / Certificate Recipient

Name
Address
Rank
Reference

That’s it for our use case. Feel free to add any other relevant data to fit your requirements.

Deploy your API

Once you have defined the list of fields you want to extract from your Commercial General Liability Section, head over to the platform and press the ‘Create a new API’ button.

You now land on the setup page. The image below shows an example you can use to set up the API; my setup is as follows:

Set up your model

Once you’re ready, click on the “next” button. We are going to specify the data types for each of the fields we want our API to extract.

Define your model

To move forward, you have two possibilities:

Upload a JSON config
Copy the following JSON into a file and upload it through the interface:

{
  "problem_type": {
    "classificator": { "features": [], "features_name": [] },
    "selector": {
      "features": [
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "agency",
          "public_name": "Agency",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "carrier",
          "public_name": "Carrier",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "agency_customer_id",
          "public_name": "Agency Customer ID",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "policy_number",
          "public_name": "Policy Number",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "convention": "US" } },
          "handwritten": false,
          "name": "effective_date",
          "public_name": "Effective Date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "applicant",
          "public_name": "Applicant",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "naic_code",
          "public_name": "NAIC Code",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "convention": "US" } },
          "handwritten": false,
          "name": "date",
          "public_name": "Date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "property_damage_deductibles",
          "public_name": "Property Damage Deductibles",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "bodily_injury_deductibles_per_claim",
          "public_name": "Bodily Injury Deductibles per claim",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "bodily_injury_deductibles_per_occurrence",
          "public_name": "Bodily Injury Deductibles per occurrence",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "general_aggregate_limit",
          "public_name": "General Aggregate Limit",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "products_completed_operations_aggregate",
          "public_name": "Products & Completed Operations Aggregate",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "personal_advertising_injury",
          "public_name": "Personal & Advertising Injury",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "each_occurrence",
          "public_name": "Each Occurrence",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "damage_to_rented_premises_each_occurrence",
          "public_name": "Damage to Rented Premises (each occurrence)",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "medical_expense_any_other_person",
          "public_name": "Medical Expense (any other person)",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "employee_benefit",
          "public_name": "Employee Benefit",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "premises_operations_premium",
          "public_name": "Premises Operations Premium",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "products_premium",
          "public_name": "Products Premium",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "other_premium",
          "public_name": "Other Premium",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "total_premium",
          "public_name": "Total Premium",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "loc",
          "public_name": "Loc #",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "haz",
          "public_name": "Haz #",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "classification",
          "public_name": "Classification",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "class_code",
          "public_name": "Class Code",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "premium_basis",
          "public_name": "Premium basis",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "exposure",
          "public_name": "Exposure",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "terr",
          "public_name": "Terr",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": 1 } },
          "handwritten": false,
          "name": "premops_rate",
          "public_name": "Premops Rate",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "convention": "US" } },
          "handwritten": false,
          "name": "proposed_retroactive_date",
          "public_name": "Proposed Retroactive Date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "convention": "US" } },
          "handwritten": false,
          "name": "entry_date",
          "public_name": "Entry Date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "deductible_per_claim",
          "public_name": "Deductible per Claim",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "number_of_employees",
          "public_name": "Number of Employees",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "number_of_employees_covered",
          "public_name": "Number of Employees covered",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "convention": "US" } },
          "handwritten": false,
          "name": "retroactive_date",
          "public_name": "Retroactive Date",
          "semantics": "date"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "paid_to_subcontractor",
          "public_name": "Paid to Subcontractor",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": 1 } },
          "handwritten": false,
          "name": "of_work_subcontracted",
          "public_name": "% of work subcontracted",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": 1 } },
          "handwritten": false,
          "name": "number_of_full_time_staff",
          "public_name": "Number of Full Time Staff",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": 1 } },
          "handwritten": false,
          "name": "number_of_part_time_staff",
          "public_name": "Number of Part-Time Staff",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "product",
          "public_name": "Product",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "annual_gross_sales",
          "public_name": "Annual Gross Sales",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "number_of_units",
          "public_name": "Number of Units",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "time_to_market",
          "public_name": "Time to Market",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "expected_life",
          "public_name": "Expected Life",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "intended_use",
          "public_name": "Intended Use",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "principal_component",
          "public_name": "Principal Component",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": 0 } },
          "handwritten": false,
          "name": "name",
          "public_name": "Name",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "address",
          "public_name": "Address",
          "semantics": "word"
        },
        {
          "cfg": { "filter": { "is_integer": -1 } },
          "handwritten": false,
          "name": "rank",
          "public_name": "Rank",
          "semantics": "amount"
        },
        {
          "cfg": { "filter": { "alpha": -1, "numeric": -1 } },
          "handwritten": false,
          "name": "reference",
          "public_name": "Reference",
          "semantics": "word"
        }
      ],
      "features_name": [
        "agency",
        "carrier",
        "agency_customer_id",
        "policy_number",
        "effective_date",
        "applicant",
        "naic_code",
        "date",
        "property_damage_deductibles",
        "bodily_injury_deductibles_per_claim",
        "bodily_injury_deductibles_per_occurrence",
        "general_aggregate_limit",
        "products_completed_operations_aggregate",
        "personal_advertising_injury",
        "each_occurrence",
        "damage_to_rented_premises_each_occurrence",
        "medical_expense_any_other_person",
        "employee_benefit",
        "premises_operations_premium",
        "products_premium",
        "other_premium",
        "total_premium",
        "loc",
        "haz",
        "classification",
        "class_code",
        "premium_basis",
        "exposure",
        "terr",
        "premops_rate",
        "proposed_retroactive_date",
        "entry_date",
        "deductible_per_claim",
        "number_of_employees",
        "number_of_employees_covered",
        "retroactive_date",
        "paid_to_subcontractor",
        "of_work_subcontracted",
        "number_of_full_time_staff",
        "number_of_part_time_staff",
        "product",
        "annual_gross_sales",
        "number_of_units",
        "time_to_market",
        "expected_life",
        "intended_use",
        "principal_component",
        "name",
        "address",
        "rank",
        "reference"
      ]
    }
  }
}
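
Before uploading, you may want to check that the file is internally consistent. The sketch below assumes the structure shown above (a `selector` block with `features` and `features_name`) and that the two lists are expected to match in order. The function names and the set of allowed `semantics` values are assumptions drawn only from this example, not from a published schema:

```python
import json

# Semantics values that appear in the example config above (assumed complete).
KNOWN_SEMANTICS = {"word", "date", "amount"}


def check_config(config):
    """Return a list of human-readable problems found in a config dict."""
    selector = config["problem_type"]["selector"]
    names = [feature["name"] for feature in selector["features"]]
    problems = []

    # Each feature name should be unique.
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        problems.append(f"duplicate names: {duplicates}")

    # features_name should list the same names in the same order.
    if names != selector["features_name"]:
        problems.append("features and features_name do not match")

    # Flag semantics values that do not appear in the example.
    for feature in selector["features"]:
        if feature["semantics"] not in KNOWN_SEMANTICS:
            problems.append(f"unknown semantics for {feature['name']}")

    return problems


def check_config_file(path):
    """Load a JSON config from `path` and run the checks on it."""
    with open(path, encoding="utf-8") as handle:
        return check_config(json.load(handle))
```

An empty list means none of these checks found a problem; it does not guarantee the platform will accept the file.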

Or build your data model manually
Using the interface, add each field to your data model.

In our example, here are the different field configurations we used:

General Information

  • Agency: type String that never contains numeric characters.
  • Carrier: type String that never contains numeric characters.
  • Agency Customer ID: type String without specifications.
  • Policy Number: type String without specifications.
  • Effective Date: type Date with US format.
  • Applicant: type String that never contains numeric characters.
  • NAIC Code: type Number without specifications.
  • Date: type Date with US format.

Coverages

  • Property Damage Deductibles: type Number without specifications.
  • Bodily Injury Deductibles per claim: type Number without specifications.
  • Bodily Injury Deductibles per occurrence: type Number without specifications.

Limits

  • General Aggregate Limit: type Number without specifications.
  • Products & Completed Operations Aggregate: type String without specifications.
  • Personal & Advertising Injury: type String without specifications.
  • Each Occurrence: type String without specifications.
  • Damage to Rented Premises (each occurrence): type String without specifications.
  • Medical Expense (any other person): type String without specifications.
  • Employee Benefit: type String without specifications.

Premium

  • Premises Operations Premium: type Number without specifications.
  • Products Premium: type Number without specifications.
  • Other Premium: type Number without specifications.
  • Total Premium: type Number without specifications.

Schedule of Hazards

  • Loc #: type Number without specifications.
  • Haz #: type Number without specifications.
  • Classification: type String that never contains numeric characters.
  • Class Code: type Number without specifications.
  • Premium basis: type String that never contains numeric characters.
  • Exposure: type Number without specifications.
  • Terr: type Number without specifications.
  • Premops Rate: type Number that is always an integer.

Claims Made

  • Proposed Retroactive Date: type Date with US format.
  • Entry Date: type Date with US format.

Employee Benefits Liability

  • Deductible per Claim: type Number without specifications.
  • Number of Employees: type Number without specifications.
  • Number of Employees covered: type Number without specifications.
  • Retroactive Date: type Date with US format.

Contractors

  • Paid to Subcontractor: type Number without specifications.
  • % of work subcontracted: type Number that is always an integer.
  • Number of Full Time Staff: type Number that is always an integer.
  • Number of Part-Time Staff: type Number that is always an integer.

Products / Completed Operations

  • Product: type String that never contains numeric characters.
  • Annual Gross Sales: type Number without specifications.
  • Number of Units: type Number without specifications.
  • Time to Market: type Number without specifications.
  • Expected Life: type Number without specifications.
  • Intended Use: type String that never contains numeric characters.
  • Principal Component: type String that never contains numeric characters.

Additional Interest / Certificate Recipient

  • Name: type String that never contains numeric characters.
  • Address: type String without specifications.
  • Rank: type Number without specifications.
  • Reference: type String without specifications.
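
The field types listed above line up with the `cfg.filter` entries in the JSON config. The sketch below generates feature entries from (public name, type) pairs. The mapping is inferred from the example config in this article rather than from a documented schema, and `FILTERS`, `slugify`, `make_feature`, and `make_selector` are illustrative names:

```python
import re

# Filter settings inferred from the example config in this article.
FILTERS = {
    "string": ("word", {"alpha": -1, "numeric": -1}),           # no specifications
    "string_no_digits": ("word", {"alpha": -1, "numeric": 0}),  # never numeric
    "number": ("amount", {"is_integer": -1}),                   # no specifications
    "integer": ("amount", {"is_integer": 1}),                   # always an integer
    "date_us": ("date", {"convention": "US"}),                  # US date format
}


def slugify(public_name):
    """Lowercase the label and join its alphanumeric runs with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", public_name.lower()).strip("_")


def make_feature(public_name, kind):
    """Build one entry of the `features` list for the given field type."""
    semantics, filter_cfg = FILTERS[kind]
    return {
        "cfg": {"filter": dict(filter_cfg)},
        "handwritten": False,
        "name": slugify(public_name),
        "public_name": public_name,
        "semantics": semantics,
    }


def make_selector(fields):
    """Build the `selector` block from (public_name, kind) pairs."""
    features = [make_feature(name, kind) for name, kind in fields]
    return {
        "features": features,
        "features_name": [feature["name"] for feature in features],
    }
```

For example, `make_feature("% of work subcontracted", "integer")` produces the name `of_work_subcontracted`, which matches the entry in the JSON config.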

Once you’re done setting up your data model, press the Start training your model button at the bottom of the screen.

Ready to train model

Train your Commercial General Liability Section OCR

Train your model

You’re all set!

Now is the time to train your Commercial General Liability Section deep learning model in the Training section of our API. This use case is quite well templated, and you should get very high performance from the first training (20 annotated documents) :)

In a few hours (minutes if you're fast), you’ll get your first model trained and will be able to use your custom OCR API for parsing the Commercial General Liability Section in your application.

To get more information about the training phase, please refer to the Getting Started tutorial. If you have any questions regarding your use case, feel free to reach out on our chat!

Updated 4 months ago

