{"id":31913,"date":"2015-03-10T11:34:03","date_gmt":"2015-03-10T19:34:03","guid":{"rendered":"https:\/\/ephesoft.com\/docs\/2019-1-2\/batch-class-management\/add-new-document\/learning-documents\/"},"modified":"2022-08-18T14:30:10","modified_gmt":"2022-08-18T21:30:10","slug":"learning-documents","status":"publish","type":"docs","link":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/","title":{"rendered":"Training Transact for Document Classification"},"content":{"rendered":"

Transact learns document types from a well-formed set of HOCR XML files arranged in a hierarchy of Batch Class > Document Type > Page Type. A few standard HOCR XML documents are registered with the Lucene search engine, which builds indexes from them. This process is called learning because it is like feeding the XML files into Lucene\u2019s memory. During classification, the HOCR files in a batch instance are compared with these indexes to find the best match and classify the pages. Document training is a one-time process, and it makes classification fast because no index needs to be generated at runtime.<\/p>\n

Steps to Train Document Classification<\/h2>\n

You can train document classification using single or multipage .tif, .tiff, or .pdf documents.<\/p>\n

    \n
  1. Create the document type.<\/li>\n
  2. Select the document that Transact needs to recognize for classification. You can either drag the sample file to the Upload Test Classification File(s)<\/strong> pane, or click the Upload Test Classification File(s)<\/strong> link.<\/li>\n<\/ol>\n

    Transact will process and learn the documents.<\/p>\n

    Note:<\/strong> Suppose you create and save the Application-Checklist document type in batch class BC1. Saving the document type creates the folders where the TIFF files to be learned can be placed, under the [Ephesoft_install_directory<\/em>]\/SharedFolders\/[Batch-Class-Id<\/em>]\/lucene\/[classification-method-sample<\/em>] folder. If the uploaded document has a single page, that page is copied to Application-Checklist_First_Page. If the uploaded document has multiple pages, the first and last pages are copied to Application-Checklist_First_Page and Application-Checklist_Last_Page, respectively, and all other pages are copied to Application-Checklist_Middle_Page. The following three subfolders are created in this case:<\/p>\n

      \n
    • Application-Checklist_First_Page<\/li>\n
    • Application-Checklist_Last_Page<\/li>\n
    • Application-Checklist_Middle_Page<\/li>\n<\/ul>\n
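The page-to-folder rule in the note above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not Transact's actual implementation; the function name assign_page_folders and its parameters are hypothetical.

```python
def assign_page_folders(doc_type, page_count):
    '''Map each 1-based page number to the sample folder it would be copied to.

    Hypothetical helper that mirrors the rule described in the note;
    it does not call or reproduce any Transact API.
    '''
    if page_count < 1:
        raise ValueError('page_count must be at least 1')

    first = doc_type + '_First_Page'
    middle = doc_type + '_Middle_Page'
    last = doc_type + '_Last_Page'

    folders = {}
    for page in range(1, page_count + 1):
        if page == 1:
            # A single-page document lands here as well.
            folders[page] = first
        elif page == page_count:
            folders[page] = last
        else:
            folders[page] = middle
    return folders


if __name__ == '__main__':
    for page, folder in assign_page_folders('Application-Checklist', 4).items():
        print(page, folder)
```

Checking the single-page case first means a one-page upload is treated as a first page only, which matches the behavior described in the note.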

      Learn File(s)<\/h2>\n

      This feature learns the documents present in the folders of the document type.<\/p>\n

      On learning, the following actions take place:<\/p>\n

      1. HOCR files are created in the folder \u201cEphesoft-install-dir\/SharedFolders\/Batch-Class-Id\/lucene-search-classification-sample\u201d for Lucene learning.<\/p>\n

      2. Thumbnails are created in the folder \u201cEphesoft-install-dir\/SharedFolders\/Batch-Class-Id\/image-classification-sample\u201d for image classification.<\/p>\n

      3. Indexes are created in the folder \u201cEphesoft-install-dir\/SharedFolders\/Batch-Class-Id\/learn-index\u201d for index learning.<\/p>\n
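As a rough aid for locating these outputs on disk, the sketch below builds the three folder paths for one batch class. It assumes the folder names are exactly as listed above, joined under the SharedFolders directory with forward slashes; the function name learning_output_folders, the install directory, and the batch class ID in the example are hypothetical placeholders.

```python
from pathlib import PurePosixPath


def learning_output_folders(install_dir, batch_class_id):
    '''Return the folders where learning is described as writing its outputs.

    Assumption: folder names and layout follow the three steps listed above.
    Adjust them if your installation differs.
    '''
    base = PurePosixPath(install_dir) / 'SharedFolders' / batch_class_id
    return {
        'hocr': base / 'lucene-search-classification-sample',
        'thumbnails': base / 'image-classification-sample',
        'indexes': base / 'learn-index',
    }


if __name__ == '__main__':
    # '/opt/Ephesoft' and 'BC1' are example values only.
    for kind, folder in learning_output_folders('/opt/Ephesoft', 'BC1').items():
        print(kind, folder)
```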

      View Learn File(s)<\/h2>\n

      You can use the keyboard to navigate between the learned-file results of different document types.<\/p>\n

      You can view the learned files of a document type in the UI.<\/p>\n

      Select a document type and click View Learn File(s)<\/strong> to open the results page.<\/p>\n

      The results page contains the following columns:<\/p>\n\n\n\n\n\n\n\n
      Column Name<\/strong><\/td>\nType of value<\/strong><\/td>\nValue options<\/strong><\/td>\nDescription<\/strong><\/td>\n<\/tr>\n
      File Name<\/td>\nString<\/td>\nNA<\/td>\nThe name of the uploaded file.<\/td>\n<\/tr>\n
      Page Type<\/td>\nString<\/td>\n\n
        \n
      • FIRST<\/li>\n<\/ul>\n
          \n
        • LAST<\/li>\n<\/ul>\n
            \n
          • MIDDLE<\/li>\n<\/ul>\n<\/td>\n
      Specifies whether the file was learned as a first, last, or middle page.<\/td>\n<\/tr>\n
      Image Classification<\/td>\nBoolean<\/td>\n\n
        \n
      • True<\/li>\n<\/ul>\n
          \n
        • False<\/li>\n<\/ul>\n<\/td>\n
      Specifies whether a thumbnail was created. The value is True if a thumbnail was created and False otherwise.<\/td>\n<\/tr>\n
      Lucene Learning<\/td>\nBoolean<\/td>\n\n
        \n
      • True<\/li>\n<\/ul>\n
          \n
        • False<\/li>\n<\/ul>\n<\/td>\n
      Specifies whether HOCR files were created. The value is True if an HOCR file was created and False otherwise.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n

      Transact has now learned the Application-Checklist document type and is ready to classify it.<\/p>\n

       <\/p>\n","protected":false},"featured_media":0,"parent":47115,"menu_order":5,"comment_status":"closed","ping_status":"open","template":"","doc_tag":[],"yoast_head":"\nTraining Transact for Document Classification | Ephesoft Docs<\/title>\n<meta name=\"robots\" content=\"noindex, follow\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Training Transact for Document Classification\" \/>\n<meta property=\"og:description\" content=\"A well-formed set of HOCR xml files which are placed in a hierarchical structure such as: Batch Class > Document […]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/\" \/>\n<meta property=\"og:site_name\" content=\"Ephesoft Docs\" \/>\n<meta property=\"article:modified_time\" content=\"2022-08-18T21:30:10+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/ephesoft.com\/docs\/wp-content\/uploads\/2015\/03\/word-image83.jpg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/\",\"url\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/\",\"name\":\"Training Transact for Document Classification | Ephesoft Docs\",\"isPartOf\":{\"@id\":\"https:\/\/ephesoft.com\/docs\/#website\"},\"datePublished\":\"2015-03-10T19:34:03+00:00\",\"dateModified\":\"2022-08-18T21:30:10+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/ephesoft.com\/docs\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Transact\",\"item\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Features and Functions\",\"item\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/\"},{\"@type\":\"ListItem\",\"position\":4,\"name\":\"Administrator Role and 
Features\",\"item\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/\"},{\"@type\":\"ListItem\",\"position\":5,\"name\":\"Batch Class Management\",\"item\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/\"},{\"@type\":\"ListItem\",\"position\":6,\"name\":\"Document Types\",\"item\":\"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/\"},{\"@type\":\"ListItem\",\"position\":7,\"name\":\"Training Transact for Document Classification\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/ephesoft.com\/docs\/#website\",\"url\":\"https:\/\/ephesoft.com\/docs\/\",\"name\":\"Ephesoft Docs\",\"description\":\"Intelligent Document Processing Made Easy\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/ephesoft.com\/docs\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Training Transact for Document Classification | Ephesoft Docs","robots":{"index":"noindex","follow":"follow"},"og_locale":"en_US","og_type":"article","og_title":"Training Transact for Document Classification","og_description":"A well-formed set of HOCR xml files which are placed in a hierarchical structure such as: Batch Class > Document […]","og_url":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/","og_site_name":"Ephesoft Docs","article_modified_time":"2022-08-18T21:30:10+00:00","og_image":[{"url":"https:\/\/ephesoft.com\/docs\/wp-content\/uploads\/2015\/03\/word-image83.jpg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/","url":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/","name":"Training Transact for Document Classification | Ephesoft Docs","isPartOf":{"@id":"https:\/\/ephesoft.com\/docs\/#website"},"datePublished":"2015-03-10T19:34:03+00:00","dateModified":"2022-08-18T21:30:10+00:00","breadcrumb":{"@id":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/learning-documents\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/ephesoft.com\/docs\/"},{"@type":"ListItem","position":2,"name":"Transact","item":"https:\/\/ephesoft.com\/docs\/products\/transact\/"},{"@type":"ListItem","position":3,"name":"Features and Functions","item":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/"},{"@type":"ListItem","position":4,"name":"Administrator Role and Features","item":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/"},{"@type":"ListItem","position":5,"name":"Batch Class 
Management","item":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/"},{"@type":"ListItem","position":6,"name":"Document Types","item":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/"},{"@type":"ListItem","position":7,"name":"Training Transact for Document Classification"}]},{"@type":"WebSite","@id":"https:\/\/ephesoft.com\/docs\/#website","url":"https:\/\/ephesoft.com\/docs\/","name":"Ephesoft Docs","description":"Intelligent Document Processing Made Easy","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/ephesoft.com\/docs\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"}]}},"comment_count":0,"_links":{"self":[{"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs\/31913"}],"collection":[{"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs"}],"about":[{"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/types\/docs"}],"replies":[{"embeddable":true,"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/comments?post=31913"}],"version-history":[{"count":4,"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs\/31913\/revisions"}],"predecessor-version":[{"id":51574,"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs\/31913\/revisions\/51574"}],"up":[{"embeddable":true,"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs\/47115"}],"prev":[{"title":"Importing and Exporting Document 
Types","link":"https:\/\/ephesoft.com\/docs\/products\/transact\/features-and-functions\/administrator\/batch-class-management\/document-types\/document-type-importexport-4050\/","href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/docs\/31893"}],"wp:attachment":[{"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/media?parent=31913"}],"wp:term":[{"taxonomy":"doc_tag","embeddable":true,"href":"https:\/\/ephesoft.com\/docs\/wp-json\/wp\/v2\/doc_tag?post=31913"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}