Overview
Certified for: Ephesoft Transact 2022.1.01 or above
The Auto Extraction plugin automatically generates key-value rules to extract data from a document based on values previously selected by operators on exported documents. This plugin is used with the Auto Export plugin.
Benefits of the Auto Extraction plugin include:
- Overriding the value populated by other extraction plugins, based on the confidence score of the extracted value.
- Marking a field as “force review” when the match level is below a configured threshold.
- Using either a separate rules database per batch class or a shared system-wide database.
Installation
- In Ephesoft Transact, go to Administrator > System Configuration.
- Select Workflow Management.
- Drag and drop the ZIP file into the Import Plugin section.
Figure 1. Import Plugin
- Restart Ephesoft Transact.
Note: The Infor IDM Export Plugins Phase 6 and earlier have a known issue: they remove an essential line from the applicationContext.xml file, located at [Ephesoft_Directory]\Application.
If that line is missing, the following error may appear in the dcma-all.log file when you try to execute the plugin after restarting Transact:
Caused by: org.activiti.engine.ActivitiException: Unknown property used in expression: ${autokvextractionplugin.performAutoKVExtraction(batchInstanceID,key)}
To resolve this error, add the following line into the applicationContext.xml file, located at [Ephesoft_Directory]\Application:
<import resource="file:C:\Ephesoft\SharedFolders/customPluginJars/*.xml"/>
Note: The above line assumes Transact is installed in C:\Ephesoft\. Adjust the path if your installation directory is different.
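The two checks described above (is the error in the log, and is the import line still present) can be scripted. The following is a minimal sketch rather than an Ephesoft-supplied tool: the function names are hypothetical, the functions work on text you have already read from dcma-all.log and applicationContext.xml, and the pattern accepts either slash style because the documented path mixes them. It does not detect whether the line has been commented out.

```python
import re

# Exception text copied from the dcma-all.log error shown above.
ERROR_MARKER = (
    "Unknown property used in expression: "
    "${autokvextractionplugin.performAutoKVExtraction"
)

# Matches an <import .../> element whose resource ends in customPluginJars/*.xml.
# Either slash style is accepted in front of and after "customPluginJars".
IMPORT_PATTERN = re.compile(
    r'<import\s+resource="file:[^"]*[\\/]customPluginJars[\\/]\*\.xml"\s*/>'
)


def log_has_plugin_error(log_text: str) -> bool:
    """Return True if the log text contains the Auto Extraction expression error."""
    return ERROR_MARKER in log_text


def context_has_plugin_import(context_xml: str) -> bool:
    """Return True if the applicationContext.xml text contains the import line."""
    return IMPORT_PATTERN.search(context_xml) is not None
```

To use it, read each file into a string (for example with `pathlib.Path(...).read_text()`) and pass the text to the matching function. If the error is present and the import is not, add the line as described above and restart Transact.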
Configuration
- From the Batch Class Management page, select your batch class and click Open.
- Go to Modules > Extraction.
- Add the AUTO_KV_EXTRACTION plugin to the list of Selected Plugins.
Figure 2. Selected Plugins
- Click Deploy to update the workflow.
- Go to Modules > Extraction > AUTO_KV_EXTRACTION.
- Configure the plugin according to your requirements. Refer to the table below for descriptions of the configurable properties.
Figure 3. Configurable Properties
Configurable Property | Description |
Auto Extraction Enabled | This switch enables the plugin. Set to True to enable. |
Enabled DLF List (or Batch Class Relative path to list) | Use this field to configure the list of index fields that should be monitored during validation. There are two available ways to list the index fields:
Note: If only one field is listed, it must be terminated with a pipe character ( | ).
|
Rules Filter Value DLF Name | Use this field to filter rules by a specific value, such as the Vendor ID, GST ID, or IBAN.
To create a filtered rule, enter the name of the index field by which the rule should be filtered. Note: If this field is left blank, rules are created without a filter. |
DLF Value Overwrite Mode | Select Overwrite or Do Not Overwrite.
The Threshold option overwrites a field only if the confidence that Transact calculated for it is lower than the DLF Overwrite Confidence Threshold. |
DLF Overwrite Confidence Threshold | This property is used only when DLF Value Overwrite Mode is set to Threshold. Otherwise, enter 0. |
DLF Force Review Threshold | Each auto rule is assigned a confidence according to the Auto Confidence Values Assignment Logic.
If that confidence is below the value entered here, the DLF is marked for operator review. |
Rules Database Path | This property defines the location of the database rules. There are three available ways to configure this property:
|
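The way the list and threshold properties in the table interact can be sketched in a few lines of Python. This is an illustration of the documented behavior, not the plugin's actual code: the function names and the mode labels ("overwrite", "do_not_overwrite", "threshold") are hypothetical, and the pipe as the separator between multiple field names is an assumption inferred from the single-field note.

```python
def format_dlf_list(field_names):
    """Build an Enabled DLF List value from index field names.

    Assumption: names are pipe-separated. Per the note in the table,
    a list with only one field must end with a pipe character.
    """
    names = [name.strip() for name in field_names if name.strip()]
    if not names:
        raise ValueError("at least one index field name is required")
    joined = "|".join(names)
    return joined + "|" if len(names) == 1 else joined


def should_overwrite(mode, existing_confidence, overwrite_threshold):
    """Decide whether an auto rule may replace an already extracted value.

    The mode labels are hypothetical stand-ins for the UI options.
    """
    if mode == "overwrite":
        return True
    if mode == "do_not_overwrite":
        return False
    if mode == "threshold":
        # Overwrite only if Transact's confidence is lower than the threshold.
        return existing_confidence < overwrite_threshold
    raise ValueError(f"unknown overwrite mode: {mode}")


def needs_force_review(auto_rule_confidence, force_review_threshold):
    """A DLF is marked for operator review when the auto rule confidence
    is below the DLF Force Review Threshold."""
    return auto_rule_confidence < force_review_threshold
```

For example, with the Threshold mode and an overwrite threshold of 80, a value that Transact extracted with confidence 60 would be replaced, while one extracted with confidence 80 or higher would be kept.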
Auto Confidence Values Assignment Logic
The following Auto Rule Confidence values are assigned to auto-populated fields:
Confidence Value | Criteria |
100 | |
90 | |
85 | |
85 | |
75 | |
50 | |
20 | |
Limitations
- If you use Format Conversion or a custom script to modify a DLF value in Ephesoft Transact 2019.2 or below, consider upgrading to 2020.1 or later. A known issue removes the coordinates from DLF values after extraction, so the values no longer exactly match the HOCR SPAN entries and an extraction rule cannot be created automatically.
- A rule tests only the page on which its source value was found. For example, if a document has three pages and the rule for the total was created from a value on page 3, the Auto Extraction plugin tests only page 3 for that value. Additional rules can be created for other pages.
- Extraction rules are not automatically created if the document type is changed between extraction and export.
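The page limitation above can be illustrated with a small sketch. It assumes a deliberately simplified model that is not the plugin's real data structure: each page is a dictionary mapping a key to the value found next to it, and a rule records the 1-based page number it was created from. The function name and parameters are hypothetical.

```python
from typing import Dict, List, Optional


def apply_page_rule(pages: List[Dict[str, str]], rule_page: int, key: str) -> Optional[str]:
    """Look up `key` only on the page the rule was created from.

    `rule_page` is 1-based. Other pages are never searched, even if
    they contain the same key.
    """
    if rule_page < 1 or rule_page > len(pages):
        return None  # The document has no such page, so the rule cannot match.
    return pages[rule_page - 1].get(key)
```

With a rule created from page 3, a three-page document whose total appears only on page 1 yields no value. Under this model, a second rule created from page 1 would be needed to cover that layout.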