{"id":31867,"date":"2015-03-09T12:51:47","date_gmt":"2015-03-09T20:51:47","guid":{"rendered":"https:\/\/ephesoft.com\/docs\/2019-1-2\/moduleplugin-configuration\/page-process-module\/search-classification-plugin-2\/"},"modified":"2022-03-09T11:55:10","modified_gmt":"2022-03-09T18:55:10","slug":"search-classification-plugin-2","status":"publish","type":"docs","link":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/moduleplugin-configuration\/page-process-module\/search-classification-plugin-2\/","title":{"rendered":"Search Classification Plugin"},"content":{"rendered":"

Available<\/strong>: on-premises, cloud<\/p>\n

Introduction<\/h2>\n

This document describes how to configure and use the Search Classification plugin. The plugin classifies documents in the Page Process<\/strong> module of the workflow using Lucene-based indexing. Classification is how Ephesoft Transact associates a document with a Document Type. This document applies to Ephesoft Transact 2019.1 and above.<\/p>\n

Configuring the Search Classification Plugin<\/h2>\n

Perform the following steps to configure the SEARCH_CLASSIFICATION plugin in the Page Process<\/strong> module. You must have administrator rights to complete these steps.<\/p>\n

    \n
  1. Launch Ephesoft Transact and navigate to Administrator <\/strong>> Batch Class Management<\/strong>. Enter login credentials when prompted.<\/li>\n
  2. Select an existing batch class and click Open<\/strong> or create a new batch class. You can also copy or import an existing batch class, then modify it to create a new batch class.
    \nThe following figure illustrates the SEARCH_CLASSIFICATION plugin in a typical batch class configuration.<\/li>\n<\/ol>\n

    <\/p>\n

    Navigation to SEARCH_CLASSIFICATION Plugin<\/em><\/span><\/p>\n

    The SEARCH_CLASSIFICATION plugin works independently of the MULTIDIMENSIONAL_CLASSIFICATION_PLUGIN<\/strong> in the Page Process<\/strong> module. Both plugins can be present in the module.<\/p>\n

    3. Select the SEARCH_CLASSIFICATION plugin to set up the configuration. The Plugin Configuration <\/strong>screen for the SEARCH_CLASSIFICATION<\/strong> plugin displays.<\/p>\n

    <\/p>\n

    SEARCH_CLASSIFICATION Plugin Configuration Screen<\/em><\/p>\n

    Configurable Properties<\/h2>\n

    The following table lists and defines the configurable properties for the Search Classification plugin:<\/p>\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n
    Configurable Property<\/th>\nType of Value<\/th>\nValue Options<\/th>\nDescription<\/th>\n<\/tr>\n<\/thead>\n
    Lucene Valid Extensions<\/td>\nList of Values<\/td>\nxml<\/p>\n

    html<\/td>\n

    This field defines the valid file extensions for input files. Only files in the specified formats are used when classifying document types.<\/td>\n<\/tr>\n
    Lucene Min Term Frequency<\/td>\nInteger<\/td>\nNA<\/td>\nThis field sets the frequency below which terms will be ignored in the source document.<\/td>\n<\/tr>\n
    Lucene Min Document Frequency<\/td>\nInteger<\/td>\nNA<\/td>\nThis field sets the minimum number of documents in which a word must occur. Words that occur in fewer documents than this value are ignored.<\/td>\n<\/tr>\n
    Lucene Min Word Length<\/td>\nInteger<\/td>\nNA<\/td>\nThis field sets the minimum word length. Words in the HOCR content that are shorter than this value are ignored.<\/td>\n<\/tr>\n
    Lucene Min Query Terms<\/td>\nInteger<\/td>\nNA<\/td>\nThis field sets the minimum number of query terms that will be included in any generated query.<\/td>\n<\/tr>\n
    Lucene Top Level Field<\/td>\nString<\/td>\nNA<\/td>\nThis property is used to configure the default field for query terms.<\/td>\n<\/tr>\n
    Lucene No Of Pages<\/td>\nInteger<\/td>\nNA<\/td>\nThis property specifies the number of documents to be returned in a query search.<\/td>\n<\/tr>\n
    Lucene Index Fields<\/td>\nList of Values<\/td>\ntitle<\/p>\n

    summary<\/td>\n

    This property is used as an index field for searching the document type using Lucene.<\/td>\n<\/tr>\n
    Lucene Stop Words<\/td>\nList of Values<\/td>\ntitle<\/p>\n

    name<\/td>\n

    This property sets the words to be ignored when classifying a document.<\/td>\n<\/tr>\n
    Search Classification Switch<\/td>\nList of Values<\/td>\nON<\/p>\n

    OFF<\/td>\n

    This property enables or disables the SEARCH_CLASSIFICATION plugin for the batch class.<\/td>\n<\/tr>\n
    Search Classification Max Results<\/td>\nInteger<\/td>\nNA<\/td>\nThis field defines the maximum number of alternate value results that will be generated in the batch.xml.<\/p>\n

    The default value for this field in Ephesoft Transact is 5, which limits the overall size of the batch.xml file.<\/td>\n<\/tr>\n

    First Page Confidence Score Value<\/td>\nInteger<\/td>\nNA<\/td>\nThis property is used to update the confidence score based on the first page type.<\/td>\n<\/tr>\n
    Middle Page Confidence Score Value<\/td>\nInteger<\/td>\nNA<\/td>\nThis property is used to update the confidence score based on the middle page type.<\/td>\n<\/tr>\n
    Last Page Confidence Score Value<\/td>\nInteger<\/td>\nNA<\/td>\nThis property is used to update the confidence score based on the last page type.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n

    4. Define the settings, then click Deploy<\/strong> to save and enable the changes.<\/p>\n
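    The threshold properties in the table above (Lucene Min Term Frequency, Lucene Min Document Frequency, Lucene Min Word Length, and Lucene Stop Words) all decide which words from a page are kept for classification. The following Python sketch illustrates how such thresholds might combine, assuming they behave as the table describes. It is not Ephesoft Transact or Lucene code: the function name, parameter names, sample words, and threshold values are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def select_query_terms(page_words, doc_frequency, min_term_freq,
                       min_doc_freq, min_word_len, stop_words):
    """Return the words that survive all four filters, sorted alphabetically.

    Hypothetical illustration only; not the plugin's implementation.
    page_words    -- words extracted from one page (for example, from HOCR)
    doc_frequency -- mapping of word -> number of learned documents containing it
    """
    # Count how often each word occurs on the page (case-insensitive).
    term_counts = Counter(word.lower() for word in page_words)

    selected = []
    for term, term_freq in term_counts.items():
        if len(term) < min_word_len:
            continue  # shorter than the minimum word length
        if term in stop_words:
            continue  # explicitly listed as a stop word
        if term_freq < min_term_freq:
            continue  # occurs too rarely on this page
        if doc_frequency.get(term, 0) < min_doc_freq:
            continue  # occurs in too few learned documents
        selected.append(term)
    return sorted(selected)


# Sample data (made up for this sketch).
page_words = ["Invoice", "invoice", "total", "total", "of", "of",
              "name", "name", "rare", "rare", "once"]
doc_frequency = {"invoice": 12, "total": 9, "of": 40,
                 "name": 20, "rare": 1, "once": 30}

terms = select_query_terms(
    page_words,
    doc_frequency,
    min_term_freq=2,
    min_doc_freq=5,
    min_word_len=3,
    stop_words={"title", "name"},
)
print(terms)
```

    In this sample, "of" is dropped for being too short, "name" is a stop word, "once" appears only one time on the page, and "rare" appears in too few learned documents, so only "invoice" and "total" remain. Raising any threshold narrows the set of words considered, which may be worth keeping in mind when tuning these properties for a batch class.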

    Search Classification Execution Process<\/h2>\n

    This plugin operates in the Page Process<\/strong> module after all batch-level import processes are complete.<\/p>\n

    Ephesoft recommends completing document learning for the batch class before using this plugin. The plugin classifies incoming document images using Lucene-based indexing and functions in two stages when classifying documents:<\/p>\n