Multidimensional Classification

Introduction

Overview

Multidimensional Classification is a plugin within the Page Process module. This patented plugin classifies documents across various dimensions and combines the score of each dimension. This results in improved accuracy and confidence in document learning. This document applies to Ephesoft Transact 2019.1 and above.
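
As a rough illustration of what combining per-dimension scores can look like, the following Python sketch averages one score per dimension into a single confidence value for each document type, then picks the highest. The dimension names, the score values, the document type names, and the use of a plain average are all assumptions made for this example. The combination that the plugin actually performs is not described in this document.

```python
from statistics import mean
from typing import Dict, Tuple

# Hypothetical input: document type -> {dimension name -> score in [0, 1]}.
DimensionScores = Dict[str, Dict[str, float]]


def combine_scores(scores: DimensionScores) -> Dict[str, float]:
    """Average the per-dimension scores of each document type.

    A plain average is an assumption for illustration only; it is not
    the plugin's actual formula.
    """
    return {doc_type: mean(dims.values()) for doc_type, dims in scores.items()}


def best_match(scores: DimensionScores) -> Tuple[str, float]:
    """Return the document type with the highest combined score."""
    combined = combine_scores(scores)
    doc_type = max(combined, key=combined.get)
    return doc_type, combined[doc_type]


# Hypothetical sample values.
sample = {
    "Invoice": {"text": 0.9, "layout": 0.7},
    "Application": {"text": 0.4, "layout": 0.8},
}
print(best_match(sample))
```

In this sketch, a document type that scores moderately well on every dimension can outrank one that scores well on only a single dimension, which is the general idea behind combining dimensions.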

Comparing Multidimensional Classification and Search Classification

Multidimensional Classification is a better approach to classifying documents than the older Search Classification method, for the following reasons:

  • Search Classification was the original method of page classification in Ephesoft Transact. Although this older method continues to work well, it is a less accurate approach.
  • Multidimensional Classification contains newer algorithms that perform classification more accurately. This feature also improves document assembly.

Configuring the Multidimensional Classification Plugin

This section describes how to configure, enable, or disable the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN in the Page Process workflow module and in a batch class. Ensure that you have administrator rights to perform these tasks. These configurations are made from the Batch Class Management screen.

Notes:

  • This plugin is included by default in the BC1 batch class.
  • This plugin must be added manually to a batch class that is not based on the BC1 template batch class.
  • The default mode of this plugin is OFF.

The following figure summarizes the plugin configurations for the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN.

Plugin requirements for Multidimensional Classification

You can leave both the SEARCH_CLASSIFICATION plugin and the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN in the module only if you disable the SEARCH_CLASSIFICATION plugin.

Note: If you plan to use the Multidimensional Classification plugin to create classification results from document learning and to perform document assembly, Ephesoft recommends disabling or removing the SEARCH_CLASSIFICATION plugin. Refer to Removing or Disabling the Search Classification Plugin.

Perform the following to configure the Multidimensional Classification plugin:

  1. Launch Ephesoft Transact. Navigate to Administrator > Batch Class Management. Enter login credentials when prompted.
  2. Select an existing batch class and click Open, or create a new batch class. You can also copy or import an existing batch class, then modify it to create a new batch class.
  3. Within the batch class, navigate to Modules > Page Process. The Associated Plugins and Selected Plugins sections display in the Plugin Configuration screen.
  4. Add the plugin to the Page Process module:
    a. Select the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN from the Associated Plugins section of the Plugin Configuration screen and click the right-pointing arrow. This moves the plugin to the Selected Plugins section.

Associated Plugins for the Page Process module

b. Use the up and down arrows to rearrange the position of this plugin in relation to the other plugins of this module as needed. The following figure illustrates common plugins that are used in the Page Process module.

Selected Plugins — A typical sequence is shown above

c. Click Deploy to activate the plugin and save the changes made in step b, then click Close to complete the configuration. The following message displays upon successful configuration:

Confirmation prompt

The plugin now displays in the Page Process module and is ready to be enabled.

Page Process module with MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN

5. Enable the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN for the batch class by selecting this plugin within the Page Process module. The Plugin Configuration screen displays on the right.

6. Review or configure this plugin by opening a batch class from the Batch Class Management screen and navigating as shown below:

MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN

The following figure illustrates the Plugin Configuration screen for the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN.

Plugin Configuration for the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN

7. Enable the plugin by setting the Multidimensional Classification Switch to ON. The Multidimensional Classification Switch contains the following two options in the drop-down list:

      • ON – Enables this plugin. When this plugin is enabled, Ephesoft Transact will use this plugin to classify documents.
      • OFF – Disables this plugin. When this plugin is disabled, Ephesoft Transact will not use this plugin to classify documents.

8. Set Multidimensional Classification Max Results to the desired number. This field controls how many alternate value elements are generated in the batch.xml file produced within the workflow. The default is 5, which keeps the overall size of batch.xml small. Adjust this setting as required: a higher value increases the size of batch.xml, and a lower value decreases it.

a. Click Deploy to enable the changes.
b. Click Close to exit the Plugin Configuration screen.
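
The effect of the Max Results setting in step 8 can be sketched as keeping only the top-ranked candidates. The following Python example is a minimal sketch under stated assumptions: the `Candidate` type, the `limit_alternates` function, and the idea that candidates are ranked by confidence before being cut off are all hypothetical. It does not read or write a real batch.xml file.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    doc_type: str      # hypothetical document type name
    confidence: float  # hypothetical confidence score


def limit_alternates(candidates: List[Candidate], max_results: int = 5) -> List[Candidate]:
    """Keep the max_results highest-confidence candidates.

    The default of 5 mirrors the default of the Max Results field; ranking
    by confidence is an assumption made for this illustration.
    """
    if max_results < 0:
        raise ValueError("max_results must not be negative")
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    return ranked[:max_results]


# Eight hypothetical candidates with confidences 0.1 through 0.8.
candidates = [Candidate(f"Type{i}", i / 10) for i in range(1, 9)]
kept = limit_alternates(candidates)
print(len(kept), [c.doc_type for c in kept])
```

Fewer retained candidates mean fewer alternate value elements per page, which is why a lower setting keeps batch.xml smaller.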

Once configured, the Ephesoft Transact administrator must train at least one document for each document type with the Multidimensional Classification plugin. Refer to Document Learning with Multidimensional Classification.

Note: After you configure Multidimensional Classification, Ephesoft recommends removing or disabling the Search Classification plugin. Refer to Removing or Disabling the Search Classification Plugin.

Setting the Classification Type for the Document Assembler Plugin

This topic describes how to select Multidimensional Classification as the classification type in the DOCUMENT_ASSEMBLER plugin in the Document Assembly module. The MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN must first be configured and enabled.

Note: If this has not been completed, refer to the steps under Configuring the Multidimensional Classification Plugin in the Page Process Module.

Perform the following steps:

  1. Launch Ephesoft Transact and navigate to Administrator > Batch Class Management. Enter login credentials when prompted.
  2. Open the batch class in which the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN is enabled. Select the batch class and click Open.
  3. Navigate to the Document Assembly module and select the DOCUMENT_ASSEMBLER plugin.

Document Assembly Module

4. The Plugin Configuration screen for the DOCUMENT_ASSEMBLER plugin displays.

DOCUMENT_ASSEMBLER Plugin Configuration, Release 4.5

5. Select MultidimensionalClassification from the DA Classification Type drop-down list.
6. Click Deploy. Confirmation windows display when both Apply and Deploy are clicked.
7. Click Close to return to the Batch Class Management screen.

Removing or Disabling the Search Classification Plugin

The user must have Administrator rights to perform this task.

This topic describes how to disable or remove the Search Classification plugin, which an administrator should do after configuring and enabling the Multidimensional Classification plugin.

Note: If you have any scripts that rely on results from the Search Classification plugin, leave it in the module.

Perform these steps to remove or disable the SEARCH_CLASSIFICATION plugin within the Page Process module for the batch class that uses the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN.

  1. Launch Ephesoft Transact and navigate to Administrator > Batch Class Management. Enter login credentials when prompted.
  2. Either select an existing batch class and click Open, or create a new batch class.
  3. Navigate to the Page Process module. The Associated Plugins and Selected Plugins for the Page Process module display on the right.

SEARCH_CLASSIFICATION Plugin in the Page Process Module

4. Perform one of the following steps to remove or disable the Search Classification plugin:
a. To remove the SEARCH_CLASSIFICATION plugin, select it in the Selected Plugins field on the right. Use the left-facing arrow button to move this plugin to the Associated Plugins field on the left.
b. To disable the SEARCH_CLASSIFICATION plugin, select (highlight) it. The Plugin Configuration screen displays. Select OFF from the drop-down list.

Search Classification Switch

      c. Click Deploy to activate this change.

5. Retrain Ephesoft Transact with Learn Files for the document type, as applicable. When you disable the Search Classification plugin, you must retrain the batch class with the documents using the Multidimensional Classification plugin, which generates its own model for classifying the data.

Document Learning with Multidimensional Classification

The standard method of training a batch class for a document type applies to both Search Classification and Multidimensional Classification.

The Multidimensional Classification mechanism is based on supervised learning.

  • Both the Search Classification and Multidimensional Classification plugins use learning to classify the pages in a given batch. For Multidimensional Classification, this learning includes search classification.
  • The MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN takes a sample of search classification and updates it during the learning process.

Within an open batch class, use the Learn File(s) button in the Batch Class Management screen. When a document type is trained with at least one file, the plugin creates a new file with the following name in the Batch Class folder:

BC<ID>-dimensions
Example: BC8-dimensions
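
To confirm that learning has produced this file, you can check for it by name. The following Python sketch builds the expected name from a batch class identifier and tests whether it exists. The folder path shown is a placeholder, not a real location, and the helper names are hypothetical. Substitute the actual Batch Class folder of your installation.

```python
from pathlib import Path


def dimensions_file_name(batch_class_id: str) -> str:
    """Build the expected name, for example "BC8" -> "BC8-dimensions"."""
    return f"{batch_class_id}-dimensions"


def has_dimensions_file(batch_class_folder: Path, batch_class_id: str) -> bool:
    """Report whether the learned file exists in the given folder."""
    return (batch_class_folder / dimensions_file_name(batch_class_id)).exists()


# Placeholder path (an assumption): replace with your Batch Class folder.
folder = Path("path/to/batch-class-folder")
print(dimensions_file_name("BC8"), has_dimensions_file(folder, "BC8"))
```

If the check reports that the file is missing, no document type in that batch class has yet been trained with the plugin enabled.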

Click Learn File(s) to update the learning that occurs during classification.
The following two figures illustrate the Learn File(s) button from an open batch class from the Batch Class Management screen.

Learn Files in the Document Types screen for a sample batch class, Ephesoft Transact Release 4.1.1.0

Learn Files in the Document Types screen for a sample batch class, Ephesoft Transact Release 4.5

Multidimensional Classification and Machine Learning

The MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN supports machine learning for document types. If a document is classified incorrectly during an initial processing job, the plugin can learn the change required to correct the classification, so that it classifies correctly in subsequent batch processing jobs.

Testing Classification of a Document Type with Multidimensional Classification

This topic describes how to learn a new document in order to test classification of a new document type. Ensure that the following prerequisite is met before performing this task:

  • You must have sample documents available in electronic format (PDF or TIFF) for each document type that is to be tested.

Perform the following steps to test classification of a document type, for the batch class with the Multidimensional Classification plugin enabled:

  1. Launch Ephesoft Transact and navigate to Administrator > Batch Class Management. Enter login credentials when prompted.
  2. Open the batch class in which the Multidimensional Classification plugin is enabled. Select the batch class, then click Open. The batch class opens with a list of document types.
  3. To learn samples for a document type, upload sample documents.
    a. For each document type, click Learn Files.
    b. Select the document type. Then, click the Upload Test Classification File(s) link to select and upload a test image file for the document type.

The following message displays when the test file upload is complete.

Success confirmation dialog

   Note: You can also drag and drop the sample image file to the Drag and Drop Files Here area below the Upload Test Classification File(s) link.
4. Navigate to the document type screen where you uploaded the test image file in the previous step, and click Test Classification on the toolbar at the top of the screen.
The Test Classification screen displays.

Test Classification screen

    5. Select an option from the Workflow drop-down list as described in the following summary:

  • ON – The Classification Types drop-down list is disabled. Test classification results are based on configurations within the batch class.

Test Classification dialog

  • OFF – Test classification results are based on the selection you make from the Classification Types drop-down list.

Classification Types drop-down menu options

The various classification types available are as follows:

  • Search Classification
  • Barcode Classification
  • Image Classification
  • Automatic Classification
  • Keyword Classification
  • Multidimensional Classification

6. Click Classify.
The Test Classification screen is updated with classification results as shown in the following image.

Test Classification screen with sample results
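
The Workflow setting from step 5 can be summarized as a small decision rule. The following Python sketch is a reading aid, not product code. The function name `classification_basis` and the returned strings are hypothetical. Only the ON and OFF behavior and the list of classification types are taken from the steps above.

```python
from typing import Optional

# Classification types listed for the Classification Types drop-down list.
CLASSIFICATION_TYPES = (
    "Search Classification",
    "Barcode Classification",
    "Image Classification",
    "Automatic Classification",
    "Keyword Classification",
    "Multidimensional Classification",
)


def classification_basis(workflow: str, selected_type: Optional[str] = None) -> str:
    """Return what the test classification results are based on."""
    if workflow == "ON":
        # The Classification Types list is disabled, so any selection is ignored.
        return "batch class configuration"
    if workflow == "OFF":
        if selected_type not in CLASSIFICATION_TYPES:
            raise ValueError(f"unknown classification type: {selected_type!r}")
        return selected_type
    raise ValueError("workflow must be 'ON' or 'OFF'")


print(classification_basis("ON"))
print(classification_basis("OFF", "Multidimensional Classification"))
```

Setting Workflow to OFF is therefore the way to test Multidimensional Classification on its own, independent of how the batch class is currently configured.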

Additional Information About Multidimensional Classification

Learning with Multidimensional Classification

Multidimensional classification can be described as “page classification learning combined with search classification learning.” This classification mechanism works on supervised learning.

The user begins this process by importing documents, then moving to the Page Process module and clicking Learn Files to update learning for page and search classification.

The plugin uses learning to classify the pages in the given batch, and simultaneously learns search classification. The plugin takes the sample of search classification and makes corresponding updates to the page-classification learning.

Phases of Multidimensional Classification

Click Learn Files to begin the learning phase. A single model file is kept per batch class. In the learning phase for a document, the system performs calculations and saves the model file. Multiple types of information are saved for each document page type (first page, middle page, and last page).

During the batch execution phase, the pages in the batch are classified according to learning.

When performing auto-learning of the document type and indexes for search classification, Ephesoft Transact also updates the model file for multidimensional classification.

During the document assembly phase, this plugin works in the same way as search classification.

Conclusion

This concludes the description of how to configure and use the Multidimensional Classification plugin in Ephesoft Transact.

For additional information about configuring or using classification in Ephesoft Transact, refer to the following documents:

For additional information about batch class creation, setup, and configuration, refer to the following documents: