Beta: RecoStar HOCR for Linux

Introduction

Important: This document covers configuration of the RECOSTAR_HOCR plugin for Linux.

Transact for Linux 2020.1.05 introduces the RECOSTAR_HOCR plugin for Page Processing. The plugin improves OCR accuracy for non-English languages and provides more flexible pre-OCR image processing for those languages.

This release is the first phase of a multi-phase plan to provide additional RecoStar-based plugins for ingestion, classification, extraction, and export. Planned capabilities include, but may not be limited to, the following:

  • Document import
  • Barcode extraction
  • Fixed form extraction
  • Native ICR/OMR configuration
  • Batch export

The RECOSTAR_HOCR plugin for Linux will remain in beta until it is certified for production or Ephesoft determines that it is feature complete based on feedback from our customers. The plugin is provided out of the box and is optional to use. Do not use the RECOSTAR_HOCR plugin in place of the NUANCE_HOCR plugin unless a specific use case requires it.

Limitations

The RECOSTAR_HOCR plugin for Linux has the following known limitations:

  • No EText support
  • Performance is reduced by up to 30% compared to RECOSTAR_HOCR for Windows
  • The plugin has no barcode support, so the Barcode Switch should remain OFF

Other features that are available in the Windows version of the RECOSTAR_HOCR plugin are not yet supported in the Linux version. These include, but are not limited to, the following:

  • Fixed form extraction
  • Barcode extraction
  • Native key-value snippet ICR extraction
  • WebServices import or export capabilities

Prerequisites

To configure and use the RECOSTAR_HOCR plugin, the following must be in place:

  • Ephesoft Transact version 2020.1.05 or higher is installed.
  • A batch class with a document type is configured. For detailed steps, refer to Add New Document Type.
  • The RECOSTAR_HOCR plugin is added to the Page Process module for the batch class. For detailed steps, refer to Configuring Plugins.
  • Any other HOCR plugins are removed from the batch class Page Process module.
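
The prerequisites above can be expressed as a quick pre-flight check. The following Python sketch is illustrative only: it is not part of Transact, the function names are hypothetical, and it assumes you already know the installed version string and the plugin names in the Page Process module (for example, by reading them from the Batch Class Management page). It also assumes that every HOCR plugin name ends in _HOCR, as RECOSTAR_HOCR and NUANCE_HOCR do.

```python
# Illustrative pre-flight check; not part of Ephesoft Transact.
MIN_VERSION = "2020.1.05"


def parse_version(version):
    """Turn a dotted version such as '2020.1.05' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum=MIN_VERSION):
    """Return True if the installed version is the minimum or higher."""
    return parse_version(installed) >= parse_version(minimum)


def conflicting_hocr_plugins(page_process_plugins):
    """List HOCR plugins other than RECOSTAR_HOCR (assumes the _HOCR suffix)."""
    return [
        name for name in page_process_plugins
        if name.endswith("_HOCR") and name != "RECOSTAR_HOCR"
    ]
```

For example, meets_minimum("2019.2") returns False, and conflicting_hocr_plugins(["RECOSTAR_HOCR", "NUANCE_HOCR"]) returns a list containing only NUANCE_HOCR, which would need to be removed from the module before you continue.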

Configure the RECOSTAR_HOCR Plugin

This section provides information on how to configure the RECOSTAR_HOCR plugin. This plugin only needs to be configured once per batch class.

To add the plugin and open its configuration:

  1. From the Batch Class Management page, select and open your batch class.
  2. Go to Modules and select the Page Process module folder. The Plugin Configuration screen appears.
  3. On the Plugin Configuration screen, locate the RECOSTAR_HOCR plugin in the Associated Plugins pane.
  4. Select the plugin and click the Add Selected icon to move it to the Selected Plugins pane.
  5. Click Deploy.
  6. Expand the Page Process module folder and select the RECOSTAR_HOCR plugin. The Plugin Configuration screen for the plugin appears.

The configurable properties for this plugin are listed below with their options and descriptions.

Image OCR Recostar Project File Name
  • Options: Fpr.rsp, Fpr_MultiLanguage.rsp, Fpr_Barcode.rsp
  • Description: This option is used to specify the project file name used for performing OCR.

Recostar Auto Rotate switch
  • Options: ON, OFF
  • Description: This property is used to auto-rotate the input images on the basis of orientation computed by the RecoStar project.

Recostar Switch
  • Options: ON, OFF
  • Description: Use this switch to enable or disable this plugin.

Barcode Switch
  • Options: ON, OFF
  • Description: Ensure this switch is set to OFF due to limitations in this beta. This property is used to read the barcode from the input images using the barcode-enabled RecoStar project FPR_Barcode.rsp file.

Recostar Deskew Switch
  • Options: ON, OFF
  • Description: This switch determines whether or not input images must be deskewed.

Recostar Font Switch
  • Options: ON, OFF
  • Description: The RecoStar Font Switch allows the user to detect any data that has been manually altered or added to the documents. By default, the Font Switch is set to OFF.

OCR/Country/Language
  • Options: Multiple countries and languages
  • Description: Type the country, countries, language, or languages that need to be supported during OCR operations. When adding multiple values, separate each value with a semicolon (;) and no space. The system will also populate a dropdown menu when you start typing a value in the field.
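
As a sketch of the constraints described for the properties above, the following Python snippet builds an OCR/Country/Language value and checks a set of property values before you enter them on the Plugin Configuration screen. It is not a Transact API: the dictionary keys simply mirror the property labels above, the function names are hypothetical, and any language names passed in are placeholders rather than a list of supported values.

```python
# Illustrative helper; not part of Ephesoft Transact.
SWITCH_PROPERTIES = [
    "Recostar Auto Rotate switch",
    "Recostar Switch",
    "Barcode Switch",
    "Recostar Deskew Switch",
    "Recostar Font Switch",
]
PROJECT_FILES = ["Fpr.rsp", "Fpr_MultiLanguage.rsp", "Fpr_Barcode.rsp"]


def join_languages(values):
    """Join values with a semicolon and no space, as the field expects."""
    cleaned = [value.strip() for value in values if value.strip()]
    return ";".join(cleaned)


def check_settings(settings):
    """Return a list of problems found in a dict of property values."""
    problems = []
    for name in SWITCH_PROPERTIES:
        if settings.get(name) not in ("ON", "OFF"):
            problems.append(f"{name} must be ON or OFF")
    if settings.get("Barcode Switch") == "ON":
        problems.append("Barcode Switch must stay OFF in this beta")
    project = settings.get("Image OCR Recostar Project File Name")
    if project not in PROJECT_FILES:
        problems.append(f"Unknown project file: {project}")
    return problems
```

For instance, join_languages(["German", " French "]) strips the stray spaces and returns German;French, which matches the semicolon-and-no-space format the field expects. An empty list from check_settings means none of the checked constraints were violated.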

 
