KB00027534: How to Create a Script to Remove Blank Pages

This article describes how to remove blank white pages from documents in all Ephesoft Transact versions. Blank pages can be removed from the documents in a batch instance while the batch is processed in the Ephesoft Transact workflow.

  • This capability is not enabled by default in Ephesoft Transact, but it can be added with a basic script that reads the HOCR.xml file associated with each page. This file is located at the following path:

<SharedFolders>\ephesoft-system-folder\<batchInstance>\<batchInstance>_PG<x>_HOCR.xml

  • A custom script for this purpose must parse each page's HOCR.xml file and find the HocContent tags.
  • If the HocContent tags in a page's HOCR.xml file are empty, no text was extracted from that page. Such a page is therefore blank and can be removed.
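
The check described above can be sketched as follows. This is a minimal standalone illustration in Python, not an Ephesoft Transact script plugin: the function names (local_name, is_blank_page, find_blank_pages) are hypothetical, the tag name HocContent is taken from this article and should be verified against a real HOCR.xml file, and the glob pattern assumes the file naming shown in the path above. The sketch only identifies blank pages; removing them from the batch is left to the surrounding script.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def is_blank_page(hocr_xml, content_tag="HocContent"):
    """Return True if every content_tag element in the HOCR XML is empty.

    hocr_xml is the file content as str or bytes. The default tag name
    comes from the article and is an assumption; pass another name if
    your files use a different one.
    """
    root = ET.fromstring(hocr_xml)
    contents = [el for el in root.iter() if local_name(el.tag) == content_tag]
    if not contents:
        # No matching tag was found. The tag name is probably wrong, so
        # stay conservative and do not classify the page as blank.
        return False
    # A tag counts as empty when it holds no text other than whitespace.
    return all(not "".join(el.itertext()).strip() for el in contents)


def find_blank_pages(batch_folder, content_tag="HocContent"):
    """Return the names of HOCR files in batch_folder that look blank.

    Assumes the '<batchInstance>_PG<x>_HOCR.xml' naming from the article.
    """
    folder = Path(batch_folder)
    return sorted(
        path.name
        for path in folder.glob("*_PG*_HOCR.xml")
        if is_blank_page(path.read_bytes(), content_tag)
    )
```

For example, calling find_blank_pages on a batch instance folder returns the HOCR file names whose HocContent tags hold no text. When no HocContent tag is found at all, the page is deliberately treated as not blank, so a wrong tag name cannot cause pages with content to be flagged for removal.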