Revision 2 — March 16, 2020

Overview of the RECOSTAR_EXTRACTION Plugin

The RECOSTAR_EXTRACTION plugin is part of the Extraction module by default. It extracts data from the fields contained in a document.

  • The fields in a document are defined by the Document_Assembler plugin in the Document Assembly module.
  • The RECOSTAR_EXTRACTION plugin extracts data values from these document-level fields in the Extraction module.

Note: The RecoStar extraction plugin supports extraction on Windows installations of Ephesoft Transact.
For Linux installations, extraction is performed by the Nuance extraction plugin.
For extraction on Linux installations, refer to the separate Wiki article.

The following snapshot illustrates typical components of the Extraction module, including the RECOSTAR_EXTRACTION plugin:

Extraction Module with RECOSTAR_EXTRACTION Plugin

When using this plugin, the document-level fields are populated by reading an XML file that the RecoStar OCR engine creates. RecoStar uses an .rsp project file that is configured for the document’s pages.

The .rsp file used by the RECOSTAR_EXTRACTION plugin is normally located in the following path:

  • SharedFolder\{Batch Class}\fixed-form-extraction\*.rsp

Otherwise, the RSP project file is located in the bin folder, as follows:

  • {Application}\native\RecostarPlugin\bin\*.rsp
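The lookup order described above can be sketched in Python. This is a minimal illustration only, not Ephesoft code: the function name find_project_files and its parameters are hypothetical, and the folder layout (a fixed-form-extraction folder under the batch class folder, and a native, RecostarPlugin, bin folder chain under the application folder) is an assumption based on the two paths listed above.

```python
from pathlib import Path


def find_project_files(shared_folder, batch_class, application_dir):
    """Return .rsp files from the batch class folder, else from the bin folder.

    Hypothetical helper; the directory layout is assumed, not taken from
    the Transact source code.
    """
    primary = Path(shared_folder) / batch_class / "fixed-form-extraction"
    fallback = Path(application_dir) / "native" / "RecostarPlugin" / "bin"
    for folder in (primary, fallback):
        if folder.is_dir():
            found = sorted(folder.glob("*.rsp"))
            if found:
                # Stop at the first location that contains a project file.
                return found
    return []
```

The function returns an empty list when neither location contains an .rsp file, which leaves the decision about how to report the problem to the caller.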

The following illustration contains three fields (circled) as an example of how to identify fields in the .rsp file.

Sample RSP file in RecoStar with three sample Fields identified

For additional information about creating the RSP file for extraction with the RECOSTAR_EXTRACTION plugin, refer to the following resources:

  • Ephesoft Fixed Form Tutorial

Configuring the RECOSTAR_EXTRACTION Plugin

Perform these steps to configure the RECOSTAR_EXTRACTION plugin in the Extraction module:

Note: The Administrator user account is required for this procedure.

1. Launch the Ephesoft Transact application and select Administrator > Batch Class Management.

The system prompts you to log in. Provide login parameters as prompted.

The Batch Class Management screen appears, displaying all the batch classes currently contained in Transact.

Batch Class Management screen

2. Open the batch class to be configured. Select (check) the batch class and click Open.

3. In the navigation pane on the left side, expand the Modules section, and click Extraction to display the plugins currently configured for the Extraction module.

Extraction Module and Plugins

4. Click (highlight) the RECOSTAR_EXTRACTION plugin. The Plugin Configuration screen appears on the right.

Plugin Configuration options for the RECOSTAR_EXTRACTION Plugin

5. Define the following settings for the RECOSTAR_EXTRACTION plugin:

Configurable Property: RecoStar Extraction color switch
Options: ON
Description:
  • Set the color switch to ON to use a PNG input file for OCR (optical character recognition).
  • Set the color switch to OFF to use a TIFF input file for OCR.

Configurable Property: RecoStar Auto Rotate switch
Options: ON
Description: Use this property to apply auto-rotation of the input images during OCR, based on the orientation provided by the RecoStar OCR engine.

Configurable Property: RecoStar Extraction switch
Options: ON
Description: Use this switch to enable or disable this plugin.

Configurable Property: Retain Intermediate File
Options: ON
Description: This switch was introduced in Ephesoft Transact (March 2018) and is available in subsequent releases. If enabled (ON), this setting deletes the XML file once batch execution and extraction are complete. If disabled (OFF), Transact retains this intermediate XML file even after batch processing is complete.

6. Click Apply to save the changes. Click Deploy to activate the changes, making them immediately applicable to batch class processing. Click Close to exit the Plugin Configuration screen.

7. Evaluate the additional settings related to this plugin, and make further changes in the batch class as needed. Follow these guidelines:

  • This plugin only requires an image as an input, which is a PNG file if the color switch is ON, or a TIFF file if the color switch is OFF.
  • Therefore, one of the following additional plugins is required in the batch class:
    • Either the CREATE_OCR_INPUT plugin or the CREATE_DISPLAY_IMAGE plugin.
      • One of these plugins must execute before this RecoStar Extraction plugin.
      • These plugins are typically located in the Page Process module, which comes before the Extraction module.
    • Ideally, place the RecoStar Extraction plugin after the page process and document classification plugins, and ensure that it does not execute until after the Review stage has been completed.
    • The RecoStar Extraction plugin requires a valid document type to be classified for the batch.
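The ordering and input-image guidelines above can be expressed as a small check. The following Python sketch is illustrative only: check_recostar_prerequisites is a hypothetical helper, and it assumes that the plugin sequence of a batch class is available as a plain list of plugin names in execution order. Transact itself does not expose this function.

```python
def check_recostar_prerequisites(plugin_order, color_switch_on):
    """Validate a plugin sequence and return the expected input image type.

    plugin_order: plugin names in execution order across all modules
                  (assumed representation, for illustration only).
    color_switch_on: True if the RecoStar Extraction color switch is ON.
    """
    target = "RECOSTAR_EXTRACTION"
    prerequisites = {"CREATE_OCR_INPUT", "CREATE_DISPLAY_IMAGE"}

    if target not in plugin_order:
        raise ValueError(f"{target} is not in the plugin sequence")

    # Only plugins that run before RECOSTAR_EXTRACTION can supply its input.
    earlier = plugin_order[: plugin_order.index(target)]
    if not prerequisites.intersection(earlier):
        raise ValueError(
            "CREATE_OCR_INPUT or CREATE_DISPLAY_IMAGE must run before " + target
        )

    # Color switch ON means PNG input; OFF means TIFF input.
    return "PNG" if color_switch_on else "TIFF"


order = ["CREATE_OCR_INPUT", "RECOSTAR_HOCR", "RECOSTAR_EXTRACTION"]
print(check_recostar_prerequisites(order, color_switch_on=False))  # prints TIFF
```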

RecoStar Extraction Dependencies

RecoStar Extraction Dependency on the RECOSTAR_HOCR Plugin

If your batch class uses the RECOSTAR_HOCR plugin (typically in the Page Process module) together with the RecoStar Extraction plugin (typically in the Extraction module), the color settings configured in the UI for these two plugins must match.

If the color switch is turned on in the RecoStar HOCR plugin, the same switch must be turned on in the RecoStar Extraction plugin.
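As a quick illustration of this rule, the Python sketch below compares the two color switch values. It is a minimal example under stated assumptions: find_color_switch_mismatch is a hypothetical helper, and the dictionary of plugin names mapped to "ON" or "OFF" is an assumed way of holding values read from the Plugin Configuration screen.

```python
def find_color_switch_mismatch(color_switches):
    """Return a warning string if the two color switches differ, else None.

    color_switches: dict mapping plugin name to "ON" or "OFF"
                    (assumed representation, for illustration only).
    """
    hocr = color_switches.get("RECOSTAR_HOCR")
    extraction = color_switches.get("RECOSTAR_EXTRACTION")

    # The rule applies only when both plugins are used in the batch class.
    if hocr is None or extraction is None:
        return None
    if hocr == extraction:
        return None
    return (
        "Color switch mismatch: "
        f"RECOSTAR_HOCR={hocr}, RECOSTAR_EXTRACTION={extraction}"
    )
```

For example, passing {"RECOSTAR_HOCR": "ON", "RECOSTAR_EXTRACTION": "OFF"} returns a warning string, while matching values return None.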

RecoStar Extraction Dependency on the Project File

In addition to the properties described above, this plugin depends on one major piece of configuration: the project file. RecoStar extracts values according to the project file being used, so the project file is a key file for this plugin.

  • Because the project file maps document-level fields to the appropriate values (or patterns or barcodes) for extraction, the project file is specific to the document type.
  • Thus, instead of specifying the project file name at the plugin level, you specify the project file name for each document type.
  • This mapping of each document type with the project file is provided in the following location:

BatchClassList > BatchClass > DocumentTypes on the Batch Class Management screen

All .rsp files located in the fixed-form-extraction folder (full path: SharedFolders\{BatchClassFolder}\fixed-form-extraction\{DocumentName}) appear in the dropdown menu.

Note: In Ephesoft Transact versions 2019.2 and above, the dropdown menu will also display .rsp files located in the common-project-files folder (full path: SharedFolders\{BatchClassFolder}\fixed-form-extraction\common-project-files). This allows users to apply the same .rsp file to more than one document type in a batch class.
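The following Python sketch shows one way to preview which .rsp files would be candidates for the dropdown menu, based on the two folders described above. It is an approximation for illustration: list_selectable_project_files and its include_common parameter are hypothetical names, and the real dropdown is populated by Transact itself.

```python
from pathlib import Path


def list_selectable_project_files(batch_class_folder, document_name,
                                  include_common=True):
    """List .rsp file names for one document type (hypothetical helper).

    batch_class_folder: path to SharedFolders\\{BatchClassFolder}.
    document_name: name of the document type folder.
    include_common: set to False to model versions earlier than 2019.2,
                    which do not read the common-project-files folder.
    """
    base = Path(batch_class_folder) / "fixed-form-extraction"
    folders = [base / document_name]
    if include_common:
        folders.append(base / "common-project-files")

    names = []
    for folder in folders:
        if folder.is_dir():
            names.extend(sorted(p.name for p in folder.glob("*.rsp")))
    return names
```

Files from the document type folder are listed first, followed by the shared files, so that a shared .rsp file is easy to tell apart from a document-specific one.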

The administrator can select the appropriate project file (.rsp file) in the following property:

  • Form Processing Project file — refer to the following image:
Form Processing Project File

RecoStar Extraction Dependency on Shared Folders

The batch class folder inside the shared folder contains a folder named ‘recostar-extraction’.

This folder contains the .rsp project files which a user can use to map the document type (for RecoStar extraction).

Troubleshooting RecoStar Extraction

Use the following table to identify and resolve possible errors with extraction plugin configurations:

1. Error message: Invalid License. Could not be verified.
   Possible root causes:
     • Network connection failure.
     • RecoStar command is not valid.
     • License is either not installed or invalid.
     • The Tomcat server is not started.

2. Error message: Problem in verifying License
   Possible root cause: Unable to connect with the Ephesoft license server, or an error occurred on the Ephesoft license server side.

3. Error message: Unable to load Fpr.rsp file
   Possible root cause: The RSP file used for processing is invalid.

4. Error message: Exception while reading from XML
   Possible root cause: Unable to process the batch.xml file, or the batch.xml file is invalid.

5. Error message: Image processing or XML updating failed
   Possible root cause: Unable to update the batch.xml file.

6. Error message: File has invalid extension
   Possible root cause: The file processed by the RecoStar OCR engine has an invalid extension.

7. Error message: Document type could not be found for page
   Possible root cause: An invalid document is being used for processing.

8. Error message: Unable to parse the orientation tag in RecoStar xml file.
   Possible root cause: The RecoStar xml file has an invalid value for the orientation tag.

9. Error message: Unable to rotate the file:according to the values specified in its xml
   Possible root cause: The RecoStar xml file has an invalid value for rotation.
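When scanning log output for these errors, the table above can be turned into a simple lookup. The Python sketch below is a minimal example, not part of Transact: the names KNOWN_ERRORS and possible_causes are hypothetical, the message fragments are copied from the table, and the log line format in the usage example is an assumption.

```python
# Message fragments and root causes copied from the troubleshooting table.
KNOWN_ERRORS = {
    "Invalid License": [
        "Network connection failure.",
        "RecoStar command is not valid.",
        "License is either not installed or invalid.",
        "The Tomcat server is not started.",
    ],
    "Problem in verifying License": [
        "Unable to connect with the Ephesoft license server, or an error "
        "occurred on the Ephesoft license server side.",
    ],
    "Unable to load Fpr.rsp file": [
        "The RSP file used for processing is invalid.",
    ],
    "Exception while reading from XML": [
        "Unable to process the batch.xml file, or the batch.xml file is invalid.",
    ],
    "Image processing or XML updating failed": [
        "Unable to update the batch.xml file.",
    ],
    "File has invalid extension": [
        "The file processed by the RecoStar OCR engine has an invalid extension.",
    ],
    "Document type could not be found for page": [
        "An invalid document is being used for processing.",
    ],
    "Unable to parse the orientation tag": [
        "The RecoStar xml file has an invalid value for the orientation tag.",
    ],
    "Unable to rotate the file": [
        "The RecoStar xml file has an invalid value for rotation.",
    ],
}


def possible_causes(log_line):
    """Return the listed root causes for the first known message in a log line."""
    lowered = log_line.lower()
    for fragment, causes in KNOWN_ERRORS.items():
        if fragment.lower() in lowered:
            return causes
    return []


# Hypothetical log line; the real log format may differ.
for cause in possible_causes("ERROR Document type could not be found for page PG0"):
    print(cause)  # prints: An invalid document is being used for processing.
```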