Document Design Accelerator

Introduction

Applies to: Transact Version 2022.1.00 or newer. Table support applies to Transact versions 2022.1.01 or newer.

Availability: Windows and Linux, on-premises or cloud.

This article describes how to use the Document Design Accelerator in Ephesoft Transact. The Document Design Accelerator feature is a batch class design tool that uses the Semantik Artificial Intelligence (AI) Engine to simplify the following when creating new document types:

  • Key-value extraction rules
  • Table extraction rules

This feature supports all languages that are supported by Transact. For a list, see Supported Languages.

Use Cases

The Document Design Accelerator is useful when you need to quickly configure a document type, including its Key-Value and Table attributes and their extraction rules.

Licensing

Document Design Accelerator is included as part of the standard Transact license. There are no additional fees or licensing requirements.

Functional Limitations

  • Ephesoft recommends limiting the number of document types per batch class to fifty or fewer. If a larger number of document variants is expected, the best practice is to merge the rules for the variants into fewer document types.
  • Computer vision is used to identify Key-Value pairs and is language agnostic. The Semantik AI engine is trained primarily with invoice-type documents and mixed results may occur with other document types. Best results are obtained when Key-Value pairs are in close proximity to one another.
  • For Transact versions prior to 2022.1.01, Document Design Accelerator does not work with encrypted batch classes in Linux.
  • Review table extraction rules to make sure they meet your requirements and refine them as needed. Points to watch for:
    • Simple tables are supported. Complex tables (those that include, but are not limited to, attributes such as multi-row headers and nested tables) may yield unexpected results.
    • Because of the nature of machine-learning models, tables may be falsely detected or missed entirely.
    • Tables that span multiple pages of a document are considered to be separate tables.
    • There are no options to exclude columns or to select specific tables when using Semantik AI Engine.
  • Semantik AI Engine identifies tables by searching for visual elements such as headers, rows or visible/invisible boundaries, and other table-like structures.
  • Document Design Accelerator does not support Table Extraction in Transact 2022.1.00.
  • Table extraction results from Document Design Accelerator and Universal Document Automation may not match. Document Design Accelerator aids in the creation of conventional table extraction rules. Universal Document Automation uses the extraction results from the Semantik AI Engine.

Prerequisites

To use the Document Design Accelerator:

  • Install Transact version 2022.1.00 or newer. For table extraction, install Transact 2022.1.01 or later.
  • Download and install the Semantik AI Engine.
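The version requirements above can be expressed as a small check. The following Python sketch is illustrative only: the function names and the dictionary keys are hypothetical, and it assumes the version string has the dotted numeric form used in this article (for example, 2022.1.01). It is not part of Transact.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a version string such as '2022.1.01' into (2022, 1, 1)."""
    return tuple(int(part) for part in version.strip().split("."))


# Minimum versions stated in this article.
KEY_VALUE_MIN = parse_version("2022.1.00")
TABLE_MIN = parse_version("2022.1.01")


def accelerator_features(version: str) -> dict[str, bool]:
    """Report which Document Design Accelerator features a version supports.

    The keys of the returned dictionary are hypothetical labels chosen
    for this example.
    """
    current = parse_version(version)
    return {
        "key_value_extraction": current >= KEY_VALUE_MIN,
        "table_extraction": current >= TABLE_MIN,
    }


if __name__ == "__main__":
    for candidate in ("2020.1.05", "2022.1.00", "2022.1.01"):
        print(candidate, accelerator_features(candidate))
```

Comparing the versions as tuples of integers avoids the pitfalls of plain string comparison, where leading zeros or differing digit counts can give misleading results.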

Use the Document Design Accelerator

This guide assumes you have already created a batch class and that you have set up a training document.

Here’s an overview of how to use the Document Design Accelerator:

  1. Create a new document type in your batch class.
  2. Train the learn file for classification.
  3. Test the extraction.
  4. If needed, adjust the extraction rules.

Note on the Training Documents

  • In Step 1, your sample document must be a populated form.
  • In Step 2, your learn file must be a blank (unpopulated) form.

You can use the same form for both the sample document and the learn file, as long as the sample document is a populated copy and the learn file is a blank copy.

Step 1: Create a Document Type with Document Design Accelerator

  1. From the Batch Class Management screen, select your batch class and click Open.
  2. Drag and drop the sample document into the Create Document Type from Sample panel.

Warning: The sample document must be a populated form. The Document Design Accelerator will not create rules effectively if the user-entry/dynamic fields on the sample document have not been populated. Do not use the Document Design Accelerator with blank forms.

Figure 1: Select and upload files in the Create Document Type From Sample panel.

  3. Complete the Document Type Details window as follows.

Figure 2. Document Type Details Window (Transact version 2022.1.01 Shown)

    1. Enter the Name and Description of the new document type.
       Note: The document type name cannot be changed after it has been created.
    2. For Transact 2022.1.01 or newer, select what you want to detect with Semantik AI:
      • Key-Value Pairs
      • Tables
    3. Click OK. The new document type is added along with other Key-Value and Table attributes that Semantik AI has identified.

Figure 3: New document type with index fields, tables, and extraction rules created by Document Design Accelerator.

Using the Semantik AI Engine, Transact will automatically create a new document type with index fields, tables, and extraction rules based on the data that was supplied in the sample.

For each Key-Value pair identified by the Semantik AI Engine, a new index field will be created, named after the key that was detected. If several identically named keys are detected, a unique index field name will be created for each one by appending a 1, 2, 3, etc. to the detected key name.
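The naming behavior described above can be pictured with a short Python sketch. It is not Transact's implementation. The function name is hypothetical, and two details are assumptions because this article does not specify them: a key that appears only once keeps its name unchanged, and the number is appended directly to the key with no separator.

```python
from collections import Counter


def unique_field_names(detected_keys: list[str]) -> list[str]:
    """Assign a distinct index field name to every detected key.

    Assumption: keys detected once are used as-is; keys detected several
    times get 1, 2, 3, ... appended in the order they were detected.
    Collisions with an existing key that already ends in a number are
    not handled in this sketch.
    """
    totals = Counter(detected_keys)  # how often each key was detected
    seen = Counter()                 # occurrences numbered so far
    names = []
    for key in detected_keys:
        if totals[key] == 1:
            names.append(key)
        else:
            seen[key] += 1
            names.append(f"{key}{seen[key]}")
    return names


if __name__ == "__main__":
    print(unique_field_names(["Invoice No", "Date", "Total", "Date"]))
```

With the sample input, the two Date keys become Date1 and Date2, while Invoice No and Total keep their detected names.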

Each index field’s data type, field type, and regular expression pattern will be populated based on the values detected by the Semantik AI Engine in the sample document.

For each table identified by the Semantik AI Engine, a table entry, its column headers and extraction rules will be created. The table attributes will be populated based on the values detected by the Semantik AI Engine. Review table extraction rules to make sure they meet your requirements and refine them as needed.

Step 2: Train the Document for Classification

  1. Select the document type you created in Step 1.
  2. Train the document for classification using the method appropriate for your document content and use case. For more information, see Test Classification. In this tutorial, we use the Search Classification method.

Important: The classification learn file should be a blank (unpopulated) form, unlike the populated form used in the previous step.

Step 3: Test Extraction

  1. From the Batch Class Management screen, drag and drop your document into the Upload Test Extraction File(s) panel.
    Note: We recommend using a document different from the training document but from the same company or document issuer.
  2. Click Test Extraction.

    Figure 4. Test Extraction.
  3. Click Extract.

    Figure 5: Extract.
  4. Verify the extraction results from the index fields and extraction rules created by the Semantik AI Engine.

    Figure 6. Verify the extraction results.
  5. Verify the table extraction results with your refinements.
    Figure 7. Verify table extraction results.
  6. Click Close.

Step 4 (Optional): Adjust Extraction Rules if Needed

The Document Design Accelerator creates extraction rules automatically, but the results may not be accurate for every document. If the test results do not meet your requirements, adjust the rules and click Test Extraction again, repeating as needed. To fine-tune extraction rules, use the standard Transact extraction rule edit tools. For help, see Create Extraction Rule (Key-Value Extraction Plugin) and Configure Table Extraction Rules.

How to Hide or Display the Create Document Type from Sample (Document Design Accelerator) Panel

This section describes how to hide or display the Create Document Type from Sample panel under Upload File(s). Note: You must be a Transact system administrator.

Hide the Panel

Figure 8: Create Document Type from Sample upload panel.

  1. Navigate to the <Ephesoft_Directory>\Application\WEB-INF\classes\META-INF folder.
  2. Open the application.properties file in your preferred text editor.
  3. Locate the following property:
    display_key_value_accelerator_sample_panel = yes
  4. Change the property value from yes to no.
  5. Save the file.
  6. Restart Transact.
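If you prefer to script the edit rather than use a text editor, the change could look roughly like the Python sketch below. The function name is hypothetical. The sketch works on the text of application.properties that you have already read into a string, so it makes no assumption about where the file is located, and it assumes the property appears as an uncommented key = value line as shown above.

```python
import re

PROPERTY = "display_key_value_accelerator_sample_panel"

# Matches the property line and captures everything up to the value.
_PATTERN = re.compile(
    rf"^([ \t]*{re.escape(PROPERTY)}[ \t]*=[ \t]*)[^\r\n]*",
    re.MULTILINE,
)


def set_panel_visibility(properties_text: str, visible: bool) -> str:
    """Return the properties text with the panel property set to yes or no."""
    value = "yes" if visible else "no"
    updated, count = _PATTERN.subn(lambda m: m.group(1) + value, properties_text)
    if count == 0:
        # Fail loudly rather than silently returning unchanged text.
        raise KeyError(f"{PROPERTY} not found in the supplied text")
    return updated


if __name__ == "__main__":
    sample = (
        "# made-up sample content for demonstration\n"
        "display_key_value_accelerator_sample_panel = yes\n"
    )
    print(set_panel_visibility(sample, visible=False))
```

Calling the function with visible=True restores the original setting. Back up the file before writing the result to it, and note that Transact must still be restarted for the change to take effect.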

Re-display the Panel

To display the panel again:

  1. Complete steps 1-3 in Hide the Panel.
  2. Change the property value from no to yes.
  3. Save the file.
  4. Restart Transact.

Troubleshooting

Refer to the issues below for assistance in troubleshooting the Document Design Accelerator. For issues related to the Semantik AI Engine, see Troubleshooting (Semantik AI Engine).

Issue: An error occurs when dragging a file into the Create Document Type from Sample panel.
Possible Root Cause: The Semantik AI Engine is not installed or is not running.
Solution: Contact your Ephesoft administrator to install or start the Semantik AI Engine service.

Issue: The Create Document Type from Sample panel is not visible.
Possible Root Cause: Your Transact instance is running a version prior to 2022.1.00, or Transact 2022.1.00 or later is running but the display_key_value_accelerator_sample_panel property is not set to yes or Transact has not been restarted since the property was changed.
Solution: Contact your Ephesoft administrator and ask them to follow Hide or Display the “Create Document Type from Sample” Panel.

Issue: Document Design Accelerator returns few or no index fields.
Possible Root Cause: An unpopulated form was dragged into the Create Document Type from Sample panel.
Solution: Confirm that the sample used in Create a Document Type from Sample is a populated form. The Document Design Accelerator will not create index fields or extraction rules unless it detects both a key and its value. Blank forms are likely to contain only keys and will not result in the automatic creation of index fields and extraction rules.

Issue: Documents of the document type created using the Document Design Accelerator are not being classified correctly.
Possible Root Cause: The document type was not trained for classification.
Solution: Train the document for classification.