{"id":30147,"date":"2019-01-30T11:54:57","date_gmt":"2019-01-30T19:54:57","guid":{"rendered":"https:\/\/ephesoft.com\/docs\/?page_id=30147"},"modified":"2020-05-19T12:24:18","modified_gmt":"2020-05-19T19:24:18","slug":"etext-support-in-pdf","status":"publish","type":"docs","link":"https:\/\/ephesoft.com\/docs\/products\/transact\/release-notes\/release-notes-2019-1\/etext-support-in-pdf\/","title":{"rendered":"EText Support – Leveraging Existing Text Layer in PDF Documents"},"content":{"rendered":"

Introduction<\/a><\/p>\n

EText Functionality<\/a><\/p>\n

Page Process Module<\/a><\/p>\n

Learn File(s) \/ Test Classification \/ Test Extraction<\/a><\/p>\n

Extraction Configuration Screens<\/a><\/p>\n

Batch.xml File<\/a><\/p>\n

Export Module<\/a><\/p>\n

Web Services<\/a><\/p>\n

Use Cases<\/a><\/p>\n

EText Mode \u2013 Automatically<\/a><\/p>\n

EText Mode \u2013 Always<\/a><\/p>\n

EText Mode \u2013 Never<\/a><\/p>\n

<\/a>Introduction<\/strong><\/h2>\n

Ephesoft Transact fully leverages the text embedded in computer-generated PDFs (also referred to by RecoStar as EText). This improves the accuracy of extracted text and greatly reduces the extraction effort on projects that process electronically generated documents.<\/p>\n

Rather than being OCRed as images, documents with an EText layer are processed using a special mechanism that extracts the data directly. Depending on the configuration, the EText support feature can be used:<\/p>\n