Regular Regex Extraction Plugin

Available: on-premises, cloud

Overview

This plugin extracts index field values based on the pattern defined for each field. A pattern is a semicolon-separated list of one or more words followed by a regular expression. The system searches each page for the regular expression. When it finds a match, it looks to the left of the match for the preceding words in the pattern. If all of the words are found, in order, the value is extracted. If only some of the words are found, or none of them, the value is not extracted.

Examples

Consider the following pattern defined for the InvoiceDate index field:  Invoice;Date;\d{1,2}[/]\d{1,2}[/]\d{2,4}

Example 1

Text string in document:  Invoice Date 21/03/2012

Result: “21/03/2012” will be extracted for the InvoiceDate index field.  This happens because “21/03/2012” matches the regular expression, “Date” is found to its left, and “Invoice” is found to the left of “Date.”

Example 2

Text string in document:  Date 21/03/2012

Result:  Nothing will be extracted for this index field.  Even though “21/03/2012” matches the regular expression, and “Date” is found to its left, the word “Invoice” is not found to the left of “Date.”
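
The two examples above can be reproduced with a minimal Python sketch of the matching rule. This is not the plugin's actual implementation: the function names `parse_pattern`, `words_precede`, and `extract` are hypothetical, the last semicolon-separated segment is assumed to be the regular expression, and words are assumed to match whole whitespace-separated tokens anywhere to the left of the match, case-sensitively. The pattern is written with `\d` for digits.

```python
import re

def parse_pattern(pattern):
    """Split 'Word1;Word2;regex' into (['Word1', 'Word2'], compiled regex).

    Assumption: the last segment is the regular expression, so the regex
    itself must not contain a semicolon.
    """
    *words, regex = pattern.split(";")
    return words, re.compile(regex)

def words_precede(words, left_text):
    """Return True if every word occurs in left_text, in the given order."""
    tokens = iter(left_text.split())
    # 'word in tokens' consumes the iterator up to the hit, so the next
    # word can only be found after the previous one.
    return all(word in tokens for word in words)

def extract(pattern, page_text):
    """Return the first regex match preceded by all pattern words, else None."""
    words, regex = parse_pattern(pattern)
    for match in regex.finditer(page_text):
        if words_precede(words, page_text[:match.start()]):
            return match.group(0)
    return None

PATTERN = r"Invoice;Date;\d{1,2}[/]\d{1,2}[/]\d{2,4}"

print(extract(PATTERN, "Invoice Date 21/03/2012"))  # 21/03/2012 (Example 1)
print(extract(PATTERN, "Date 21/03/2012"))          # None (Example 2)
```

The sketch returns only the first qualifying match per page; how the plugin ranks multiple matches is not described here.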

Plugin Configuration

The REGULAR_REGEX_EXTRACTION plugin can be configured in the following UI:

[Screenshot: Plugin Configuration]

Properties Description

| Configurable property | Type of value | Value options | Description |
| --- | --- | --- | --- |
| Regular Regex Extraction Switch | List of Values | ON, OFF | This property determines if the plugin will run or not. Default value is ON. |
| Regular Regex Confidence Score | Integer | 0 – 100 | Acts as a multiplier for the confidence score calculated by matching regex. |

The semicolon-separated set of words and regular expression can be entered in the Pattern column for each index field:

Troubleshooting

The following table lists error messages that may appear and explains what each one means.

| Error message | Possible root cause |
| --- | --- |
| Invalid input pattern sequence. | The pattern entered is not a valid regular expression, or doesn’t match the proper format. |
| No FieldType data found from data base for document type | The FieldType column doesn’t contain a valid value. |
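
To catch the first error before saving a pattern, the pattern can be checked offline. The following is a rough Python sketch, not part of the product: `validate_pattern` is a hypothetical helper, and it assumes the format described in the Overview (words separated by semicolons, with the regular expression as the last segment). The product's regular expression engine may differ from Python's `re` module, so a pattern that passes here can still be rejected by the plugin.

```python
import re

def validate_pattern(pattern):
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    # Assumption: the last segment is the regex, everything before it is words.
    *words, regex = pattern.split(";")
    if any(not word.strip() for word in words):
        problems.append("empty word between semicolons")
    if not regex.strip():
        problems.append("missing regular expression")
    else:
        try:
            re.compile(regex)
        except re.error as exc:
            problems.append(f"invalid regular expression: {exc}")
    return problems

print(validate_pattern(r"Invoice;Date;\d{1,2}[/]\d{1,2}[/]\d{2,4}"))  # []
print(validate_pattern("Invoice;Date;"))  # ['missing regular expression']
```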