Product Perspective

Ephesoft develops an Intelligent Document Capture system using open source technologies. Ephesoft Document Capture and Mailroom Automation (known as dcma in Engineering) offers an easy way to process incoming documents such as paper mail, faxes, email attachments, and the like.

Without Ephesoft, the manual workflow for mailroom document processing is very inefficient. Operators receive documents via mail, email, or fax. They organize the documents by destination or business unit and scan them using high-speed scanners. The scanned documents then go through validation, verification, and data entry. Ephesoft Document Capture and Mailroom Automation aims to automate the whole document processing workflow and requires operator intervention only to handle exceptions.

The product is (1) web-based but On-Premise[1], (2) built using open source technologies, and (3) free of any up-front license fee. The product is available in two editions. The community edition is free and has community-driven support. The enterprise edition has paid maintenance and adds features through commercial plug-ins.

The product is designed for three different profiles of users:

  1. Data entry operators review and validate the scanned or imported documents. These users work mainly with the keyboard, mouse, and keyboard shortcuts for high efficiency.
  2. Supervisors perform system-level operations such as reporting, setup, and configuration.
  3. Administrators configure the batch class and how Ephesoft interacts with Document Management Systems, Business Process Management Systems, or databases.

Functional Requirements

Ephesoft is built on top of a workflow engine. Each step in the workflow is called a Plugin and is independently responsible for one specific operation. One Plugin might be responsible for OCRing a page, and another might be responsible for exporting documents to a repository. Plugins are grouped into sub-workflow containers called Modules. For example, all plugins used to extract metadata from documents, such as Free Form extraction, Zonal OCR/ICR, or table/line item extraction, can be found in a module called Extraction.
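The relationship between Plugins and Modules described above can be illustrated with a minimal sketch. This is not Ephesoft's actual API: the `Plugin` and `Module` classes, the dictionary used as a batch, and the placeholder plugin functions are all hypothetical names chosen for illustration. Only the grouping idea (a Module runs its Plugins, each doing one operation) comes from the text.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for a batch of documents; the real structure is not described here.
Batch = Dict[str, Any]


@dataclass
class Plugin:
    """One workflow step that performs a single operation on a batch."""
    name: str
    run: Callable[[Batch], None]


@dataclass
class Module:
    """A sub-workflow container that runs its plugins in the order they were added."""
    name: str
    plugins: List[Plugin] = field(default_factory=list)

    def execute(self, batch: Batch) -> List[str]:
        executed = []
        for plugin in self.plugins:
            plugin.run(batch)
            executed.append(plugin.name)
        return executed


# Placeholder plugin bodies: each only records that it ran (assumption, no real extraction).
def free_form_extraction(batch: Batch) -> None:
    batch["free_form"] = True


def zonal_ocr(batch: Batch) -> None:
    batch["zonal"] = True


def table_extraction(batch: Batch) -> None:
    batch["table"] = True


extraction = Module("Extraction", [
    Plugin("FreeFormExtraction", free_form_extraction),
    Plugin("ZonalOCR", zonal_ocr),
    Plugin("TableExtraction", table_extraction),
])
```

Calling `extraction.execute({})` runs the three placeholder plugins in order and returns their names. Keeping each plugin as an independent callable mirrors the statement that every plugin is responsible for exactly one operation.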

Overall System Architecture

The diagram below shows the overall workflow and how Plugins and Modules are implemented. The workflow can be followed from left to right: documents are imported by the plugins in the Import Module; each page is analyzed by various plugins in the Page Processing Module; document boundaries are identified by the plugins in the Document Assembler Module; operators review the document classification results (if necessary) in Document Review; metadata is extracted from documents, based on document type, by the plugins in the Extraction Module; operators validate the extracted values in Document Validation; and finally documents are exported to their destinations.


The Ephesoft Document Capture and Mailroom Automation workflow consists of automatic modules and manual modules.
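The left-to-right order of the workflow, and its split into automatic and manual modules, can be sketched as a simple ordered list. The module labels follow the description above, except "Export", which is an assumed name for the final export step. The helper functions are hypothetical and only illustrate the ordering.

```python
from typing import List, Optional

# Module order as described in the text; "Export" is an assumed label for the last step.
MODULE_ORDER: List[str] = [
    "Import",
    "Page Processing",
    "Document Assembler",
    "Document Review",
    "Extraction",
    "Document Validation",
    "Export",
]

# The two modules that involve an operator; all others run without user interaction.
MANUAL_MODULES = {"Document Review", "Document Validation"}


def next_module(current: str) -> Optional[str]:
    """Return the module that follows `current`, or None after the last one."""
    index = MODULE_ORDER.index(current)
    if index + 1 < len(MODULE_ORDER):
        return MODULE_ORDER[index + 1]
    return None


def is_manual(module: str) -> bool:
    """True for modules that may require an operator."""
    return module in MANUAL_MODULES


def trace(start: str = "Import") -> List[str]:
    """List every module from `start` to the end of the workflow."""
    path: List[str] = []
    module: Optional[str] = start
    while module is not None:
        path.append(module)
        module = next_module(module)
    return path
```

For example, `trace("Extraction")` lists the remaining modules a batch passes through once document boundaries and types are settled.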

Automatic Workflow

The automatic modules, shown in green, do not require any user interaction to execute their plugins. The Ephesoft Server automatically processes documents as soon as they are available.

Manual Workflow

The manual modules, shown in blue, are used only when needed. If the document classification does not need to be reviewed, Ephesoft automatically skips this step in the workflow. The same applies to Document Validation. Document Review and Document Validation are each designed for a specific purpose and provide features tailored to it. Review is designed for reviewing classification results and exceptions. If Ephesoft is not sure about a document, it asks users to correct or verify the document type in this user interface. Validation is designed for reviewing extracted metadata fields. If a field is not captured with confidence or is missing information, it asks users to complete or verify the captured data.
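The skip logic for the manual modules can be sketched as two checks on a batch. This is an illustration under stated assumptions, not Ephesoft's implementation: the numeric confidence scores, the `THRESHOLD` value, and the `Document` and `ExtractedField` classes are hypothetical. The text only says that Review is needed when Ephesoft is not sure about a document type, and Validation when a field is not confidently captured or is missing.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional

THRESHOLD = 0.8  # hypothetical confidence cut-off, not an Ephesoft default


@dataclass
class ExtractedField:
    value: Optional[str]   # None or "" means the value is missing
    confidence: float      # assumed score between 0.0 and 1.0


@dataclass
class Document:
    doc_type: Optional[str]   # None means classification failed
    type_confidence: float    # assumed score between 0.0 and 1.0
    fields: Dict[str, ExtractedField] = field(default_factory=dict)


def needs_review(documents: Iterable[Document], threshold: float = THRESHOLD) -> bool:
    """Review is needed if any document type is unknown or uncertain."""
    return any(
        doc.doc_type is None or doc.type_confidence < threshold
        for doc in documents
    )


def needs_validation(documents: Iterable[Document], threshold: float = THRESHOLD) -> bool:
    """Validation is needed if any field is missing or uncertain."""
    return any(
        not item.value or item.confidence < threshold
        for doc in documents
        for item in doc.fields.values()
    )
```

A batch for which both functions return `False` would pass straight through the automatic modules without stopping for an operator.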

Both Review and Validation also support automatic and manual batch selection. In automatic batch selection, the system automatically selects the next available batch for processing. Depending on their rights, users can either only review the batch or both review and validate it. In manual batch selection, the user is shown a grid listing all the batches available for review and validation and can select the batch to work on.
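The two selection modes can be sketched as follows. The `Batch` class, the status names `"REVIEW"` and `"VALIDATION"`, and the use of list order as the meaning of "next available" are assumptions made for illustration. The text does not state how Ephesoft orders or stores batches.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Batch:
    batch_id: str
    status: str  # hypothetical values: "REVIEW" or "VALIDATION"


def available_batches(batches: List[Batch], rights: Set[str]) -> List[Batch]:
    """Manual selection: every waiting batch the user's rights allow, in queue order."""
    return [batch for batch in batches if batch.status in rights]


def select_automatically(batches: List[Batch], rights: Set[str]) -> Optional[Batch]:
    """Automatic selection: the first waiting batch the user may work on, if any."""
    candidates = available_batches(batches, rights)
    return candidates[0] if candidates else None


queue = [
    Batch("B1", "VALIDATION"),
    Batch("B2", "REVIEW"),
    Batch("B3", "VALIDATION"),
]
```

A user with review rights only, `{"REVIEW"}`, is offered `B2` in automatic mode, while a user with both rights is offered `B1` and sees all three batches in the manual grid.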