This plug-in extracts the value of a document level field according to the pattern specified for that field. The user supplies the pattern as a list of values separated by semicolons. During extraction, the plug-in splits the pattern at each semicolon and treats the last part as the regex pattern. The plug-in first matches this last part against the input. If a matching value is found, the plug-in searches for the remaining parts to the left of that value, working from right to left. Only the last part is compared as a regex pattern; the remaining parts are compared as plain words. The value is extracted only when every part is found. If any one part is missing, the value is not extracted.


Consider the following value specified for the Pattern field of a document level field:

Invoice Date;\d{1,2}[/]\d{1,2}[/]\d{2,4}

The plug-in uses the last value in the semicolon-separated list, i.e., \d{1,2}[/]\d{1,2}[/]\d{2,4}, for value extraction.

Consider the following input data, i.e., text present in an image:

Case 1: Input Data: Invoice Date 21/03/2012

Result: 21/03/2012 is extracted successfully, because both Date and Invoice are found to the left of the extracted value 21/03/2012.

Case 2: Input Data: Date 21/03/2012

Result: The regex pattern matches in this case, but the data is not extracted because Invoice is not found to the left of Date.
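
The matching rule illustrated by the two cases can be sketched in Python. This is only a minimal sketch under stated assumptions, not the plug-in's actual implementation. The function name extract_field is hypothetical. The sketch assumes that the input is split into whitespace-separated tokens, that each keyword part is further split into words (the cases above treat Invoice and Date separately), that the words must appear directly to the left of the value, and that the comparison is case-sensitive. It also uses Python's re module, whose syntax may differ in places from the regex engine the plug-in uses.

```python
import re
from typing import Optional


def extract_field(pattern_field: str, text: str) -> Optional[str]:
    """Return the first value that satisfies the rule, or None (hypothetical helper)."""
    # The last semicolon-separated part is the regex; the rest are keywords.
    *keyword_parts, value_regex = pattern_field.split(";")
    # Assumption: each keyword part is split on whitespace into plain words.
    keywords = [word for part in keyword_parts for word in part.split()]

    # Assumption: the input is compared token by token.
    tokens = text.split()
    for index, token in enumerate(tokens):
        if not re.fullmatch(value_regex, token):
            continue
        # Walk right to left through the keywords and the tokens
        # located to the left of the matched value.
        position = index - 1
        for keyword in reversed(keywords):
            if position < 0 or tokens[position] != keyword:
                break  # a keyword is missing, so this value is rejected
            position -= 1
        else:
            return token  # every keyword was found
    return None


pattern = r"Invoice Date;\d{1,2}[/]\d{1,2}[/]\d{2,4}"
print(extract_field(pattern, "Invoice Date 21/03/2012"))  # 21/03/2012
print(extract_field(pattern, "Date 21/03/2012"))          # None
```

The two calls at the end reproduce Case 1 and Case 2: the first returns the date, and the second returns None because Invoice is missing.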


Plugin Configurations

Regular regex extraction can be configured on the following UI:




Properties description:


[table caption=”” width=”800″ colwidth=”200|100|80|200″ colalign=”left|left|left|center|right”]
Configurable property,Type of value,Value options,Description
Regular Regex Extraction Switch,String,ON~~ OFF,Specifies whether the plug-in runs.~~Default ON.
Regular Regex Confidence Score,Integer,0 – 100,Acts as a multiplier for the confidence score calculated by matching regex.



To add or edit the regular expression used by Regular Regex Extraction, the user needs to add or edit the corresponding document level field on the following UI:




When a document level field is added or edited, the following screen is presented, where the regular expression can be entered in the Pattern field:





The following are a few common error messages seen when the plug-in malfunctions:


[table caption=”” width=”800″ colwidth=”80|200|200″ colalign=”left|left|left|center|right”]
S no.,Error message,Possible root cause
1,Invalid input pattern sequence.,This occurs when the entered regex pattern is not a valid pattern or is not in the proper format.
2,No FieldType data found from data base for document type,This happens when there is no field type initialized in a document.