What is the Difference Between the Document Assembler and Advanced Document Assembler?

Applies to: Ephesoft Transact and above

When the Advanced DA Switch is enabled in the DOCUMENT_ASSEMBLER plugin, the plugin will run using the ADVANCED_DOCUMENT_ASSEMBLER algorithm. So what are the differences between the two algorithms?

Separation Method

The main difference between the two algorithms is their separation method:

  • Document Assembler Algorithm: Looks at the highest confidence value for each page. When it finds a “first page”, it starts a new document.
  • Advanced Document Assembler Algorithm: Applies a proprietary algorithm that uses forward and reverse page-level look-aheads and look-behinds across all alternate values. Decisions are based on every permutation of pages and on the alternate value information in the XML.
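
To make the first bullet concrete, here is a minimal Python sketch of the basic separation rule only: take the highest-confidence value for each page and start a new document whenever that value marks a first page. The `Page` and `Document` classes, the `assemble` function, and the tuple layout for alternate values are hypothetical names chosen for illustration, not Ephesoft Transact's actual data model or API. The Advanced Document Assembler algorithm is proprietary and is not reproduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    """One page and its candidate classifications (hypothetical structure)."""
    page_id: str
    # Each alternate value is (document_type, page_position, confidence),
    # where page_position is "first", "middle", or "last".
    alternates: list[tuple[str, str, float]]


@dataclass
class Document:
    doc_type: str
    page_ids: list[str] = field(default_factory=list)


def assemble(pages: list[Page]) -> list[Document]:
    """Start a new document whenever a page's best alternate is a first page."""
    documents: list[Document] = []
    for page in pages:
        # Only the single highest-confidence alternate is considered;
        # the remaining alternates are ignored by this basic rule.
        doc_type, position, _ = max(page.alternates, key=lambda alt: alt[2])
        # Assumption: a batch that does not begin with a first page
        # still needs a document to hold its leading pages.
        if position == "first" or not documents:
            documents.append(Document(doc_type=doc_type))
        documents[-1].page_ids.append(page.page_id)
    return documents


if __name__ == "__main__":
    batch = [
        Page("PG0", [("Invoice", "first", 90.0), ("Invoice", "middle", 40.0)]),
        Page("PG1", [("Invoice", "middle", 75.0)]),
        Page("PG2", [("Invoice", "last", 60.0), ("Receipt", "first", 55.0)]),
        Page("PG3", [("Receipt", "first", 80.0)]),
    ]
    for doc in assemble(batch):
        print(doc.doc_type, doc.page_ids)
```

In this sketch, page PG2 stays in the first document because its best value is a last page, even though a lower-confidence alternate marks it as a first page. That ignored alternate is the kind of information the Advanced Document Assembler algorithm takes into account.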

Note: Both algorithms use the same weighting factors and classification method to generate document classification confidence scores. The weighting factors are as follows:

  • DA Rule first-middle-last page: 100
  • DA Rule first-page: 50
  • DA Rule middle-page: 25
  • DA Rule last-page: 50
  • DA Rule first-last page: 75
  • DA Rule first-middle page: 50
  • DA Rule middle-last page: 50
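
For reference, the weighting factors listed above can be written as a simple lookup table. This is only a sketch: `DA_RULE_WEIGHTS` and `rule_weight` are hypothetical names, and the way Transact combines these weights into a final confidence score is not described in this article, so the code encodes the values and nothing more.

```python
# Weights copied from the list above. Each key is the set of page
# positions that a DA rule covers (hypothetical representation).
DA_RULE_WEIGHTS = {
    frozenset({"first", "middle", "last"}): 100,
    frozenset({"first"}): 50,
    frozenset({"middle"}): 25,
    frozenset({"last"}): 50,
    frozenset({"first", "last"}): 75,
    frozenset({"first", "middle"}): 50,
    frozenset({"middle", "last"}): 50,
}


def rule_weight(positions):
    """Return the weight of the DA rule covering the given page positions."""
    return DA_RULE_WEIGHTS[frozenset(positions)]


if __name__ == "__main__":
    print(rule_weight(["first", "last"]))
    print(rule_weight(["middle"]))
```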

For more information on this topic, refer to the Document Assembler Plugin.