Discover the synergy of Ephesoft Transact and Microsoft enterprise applications

Microsoft product offerings are so pervasive that it would be difficult to find an organization not using some component of their enterprise suite. Companies of all sizes and industries manage content, maintain documents and work through business processes in Microsoft applications. Let’s look at the benefits of using supervised machine learning capture technology with Microsoft products, explore Ephesoft’s OpenAPI library for integration with the Microsoft enterprise suite and third-party applications, and walk through a content capture, processing and routing workflow.

If you work in an office, odds are, you’re interacting with Microsoft at some point in your day. Think about some of the applications in their product suite – Dynamics as a Customer Relationship Management or CRM tool, Azure for cloud storage and infrastructure, SharePoint for document and records management, Flow for business process management and basic workflow automation and Outlook for email communication. All of these applications or services have content processes in common: usage, storage, maintenance and process control of organizational content. Examples are office documents and records, application forms, customer email correspondence, vendor orders, sales contracts and internal memos.

According to a recent study, 57% of Fortune 500 companies utilize Azure and 80% of those same companies employ SharePoint for content management and storage. And nearly 50,000 organizations use Outlook for email correspondence. So how can a smart capture tool touch all parts of this ubiquitous office work platform while supporting the common organizational goals of automation and workplace digitization?

If we can assume that the core of any business is its documents and the data those documents represent, then Ephesoft’s Smart Capture® platform gives you access to the unstructured content that lies within them. Our supervised machine learning and analytics engine can help classify documents and extract data, minimizing manual data entry and giving you structured data from unstructured content. Whether you’re scanning sales contracts into your client repository in SharePoint or using classification Web Services in Flow, Ephesoft can address nearly any document automation need.

6 Steps in Processing Document Content

Ephesoft Smart Capture aims to reduce processing time, increase efficiency, and eliminate errors in document-centric processes. Digging into the actual technology behind Ephesoft Transact, let’s break down the method by which the application handles unstructured documents.

The first step in any capture workflow is the actual ingestion of the documents. In the case of Microsoft, we have out-of-the-box connectors for grabbing documents of all types from Outlook and SharePoint in batch. Document capture can be scheduled at a predetermined frequency or triggered in real time as a new document appears at the content source, such as when an end-user scans in a file or an email arrives in an inbox.

Next, we move on to image processing, where basic image enhancements are applied to document pages and the document is rendered fully text searchable through an embedded OCR engine. Ephesoft supports a variety of text conversion tools including OmniPage, RecoStar and Tesseract.

After the document has been broken down into individual pages and made fully text searchable, Ephesoft’s Smart Capture® platform assembles the pages into single documents and categorizes each document according to full content analysis. We use unique classification algorithms that don’t require the creation of rigid document templates for highly accurate categorization.

Once the documents are sorted into their appropriate categories, Ephesoft Transact utilizes a combination of preconfigured rules-based logic and supervised machine learning to identify and extract key pieces of metadata. These pre-defined extraction methods include key-value pair extraction, where the application searches for anchor values near a desired field, as well as pattern matching with regular expressions, fuzzy database lookups and tabular extraction.
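To make three of these extraction methods concrete, here is a minimal Python sketch using only the standard library. It is not Ephesoft Transact’s implementation; the sample OCR text, the vendor list, the function names and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import re
from difflib import get_close_matches

# Hypothetical OCR output for one invoice page (note the misread vendor name).
OCR_TEXT = """ACME Suplies Inc.
Invoice Number: INV-20417
Invoice Date: 03/14/2019
Invoice Total: $1,284.50
"""

# Hypothetical vendor master list, standing in for a database table.
KNOWN_VENDORS = ["Acme Supplies Inc.", "Globex Corporation", "Initech LLC"]


def extract_key_value(text, anchor):
    """Key-value extraction: find the anchor label and return the text after it."""
    match = re.search(re.escape(anchor) + r"\s*:\s*(.+)", text)
    return match.group(1).strip() if match else None


def extract_pattern(text, pattern):
    """Pattern matching: return the first substring that fits a regular expression."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


def fuzzy_lookup(value, candidates, cutoff=0.8):
    """Fuzzy lookup: map a possibly misread value to the closest known record."""
    lowered = {c.lower(): c for c in candidates}
    hits = get_close_matches(value.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


invoice_total = extract_key_value(OCR_TEXT, "Invoice Total")
invoice_date = extract_pattern(OCR_TEXT, r"\b\d{2}/\d{2}/\d{4}\b")
vendor_name = fuzzy_lookup(OCR_TEXT.splitlines()[0], KNOWN_VENDORS)
```

In this sketch the anchor "Invoice Total" locates the amount, the date is found purely by its shape, and the misspelled vendor line is resolved to the closest entry in the vendor list.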

Once the application has analyzed the content and completed the data extraction step, the documents may be routed to a validation queue for verification or remediation of machine-extracted index fields. Fields may be highlighted in a validation user interface (UI) based on the confidence value (or certainty of correct information) determined by the application. For example, if after this text analysis the application’s confidence in the invoice total field falls below the designated confidence score, Ephesoft Transact’s built-in document processing workflow will flag that document (and that specific field) for review by an end-user.
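The confidence check described here boils down to a simple threshold comparison. The following Python sketch shows the idea under stated assumptions: the field structure, the 0.85 threshold and the queue names are hypothetical and are not taken from Ephesoft Transact.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed scale: 0.0 (no certainty) to 1.0 (full certainty)


# Hypothetical threshold; in practice this would be configured per field or batch class.
CONFIDENCE_THRESHOLD = 0.85


def fields_needing_review(fields, threshold=CONFIDENCE_THRESHOLD):
    """Return the names of fields whose confidence falls below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


def route(fields, threshold=CONFIDENCE_THRESHOLD):
    """Send the document to validation if any field is flagged, otherwise to export."""
    flagged = fields_needing_review(fields, threshold)
    return ("validation", flagged) if flagged else ("export", [])


invoice_fields = [
    ExtractedField("VendorName", "Acme Supplies Inc.", 0.97),
    ExtractedField("InvoiceTotal", "1284.50", 0.62),
]
queue, flagged_fields = route(invoice_fields)
```

Here the document lands in the validation queue with only the invoice total flagged, so a reviewer sees exactly which field needs attention.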

The final step in the content processing workflow is exporting the content – represented by the sorted documents and their associated data files, stored as XML or CSV – to its final location. In a Microsoft environment, this can mean routing a document into SharePoint Online, Flow or Dynamics, or simply sending the document to a hot folder for a downstream system to pick up and use.
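As a rough illustration of what an exported data file might contain, the Python sketch below serializes a set of extracted fields to both XML and CSV with the standard library. The element names, attribute names and field names are assumptions for this example and do not reflect Ephesoft Transact’s actual export schema.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(doc_type, fields):
    """Serialize extracted fields to a simple, hypothetical XML layout."""
    root = ET.Element("Document", attrib={"type": doc_type})
    for name, value in fields.items():
        field = ET.SubElement(root, "Field", attrib={"name": name})
        field.text = value
    return ET.tostring(root, encoding="unicode")


def to_csv(fields):
    """Serialize extracted fields to CSV: one header row, one value row."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(fields.keys())
    writer.writerow(fields.values())
    return buffer.getvalue()


extracted = {"VendorName": "Acme Supplies, Inc.", "InvoiceTotal": "1284.50"}
xml_text = to_xml("Invoice", extracted)
csv_text = to_csv(extracted)
```

Either string could then be written next to the document in a hot folder or handed to an export connector, and the CSV writer takes care of quoting values that contain commas.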

Smart Capture in a Microsoft Workflow

There are two distinct stages of a document-centric workflow where a component of the Microsoft enterprise suite would interact with Ephesoft:
Upstream capture and document onboarding
Mid-flow processing

Let’s walk through an example workflow where Ephesoft acts as upstream capture in a common document-centric process.

Example #1: Document Onboarding with Ephesoft

Let’s say a vendor sends your company an invoice. Given the user count statistic I shared a few minutes ago, it’s safe to assume that your organization uses Outlook, and the invoice is sent to an Accounts Payable inbox via email attachment.

Ephesoft Transact monitors that inbox for new, incoming messages, and automatically ingests the email and the attached document as a part of its initial capture workflow.

Then, Ephesoft Transact identifies the email attachment as an invoice through full content analysis of the document. Remember, rigid document templatization is not a factor with Ephesoft’s Smart Capture® product suite, so it doesn’t matter if all your vendors format their invoices slightly differently. We’ll be able to recognize that information regardless of the document layout. Based on the categorization of invoice, there are key data fields that need to be extracted to populate downstream systems and processes. Using a combination of rules-based capture definitions and supervised machine learning, the application identifies fields like invoice total, net payment terms, vendor name, and line item quantities.

Next, we need to make sure that the address listed for the vendor on that document is the same address we have on file in our CRM system, so we do a database lookup against Microsoft Dynamics. If the addresses do not match, the invoice can be flagged in a validation queue for end-user review. Alternatively, we can have Ephesoft push an address update to the CRM.
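The address check can be pictured as a normalize-and-compare step. In the Python sketch below, a plain dictionary stands in for the Dynamics lookup, and the abbreviation table, function names and return values are hypothetical; a real integration would query the CRM rather than an in-memory table.

```python
import re

# Hypothetical stand-in for the vendor address records held in the CRM.
CRM_ADDRESSES = {
    "Acme Supplies Inc.": "100 Main Street, Suite 4, Springfield, IL 62701",
}

# Small, illustrative abbreviation table used to normalize both addresses.
ABBREVIATIONS = {"st": "street", "ste": "suite", "ave": "avenue", "rd": "road"}


def normalize_address(address):
    """Lowercase, drop punctuation and expand common abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def check_vendor_address(vendor, invoice_address, crm=CRM_ADDRESSES):
    """Compare the address on the invoice with the address on file."""
    on_file = crm.get(vendor)
    if on_file is None:
        return "unknown_vendor"
    if normalize_address(on_file) == normalize_address(invoice_address):
        return "match"
    return "needs_review"


result = check_vendor_address(
    "Acme Supplies Inc.", "100 Main St., Ste 4, Springfield, IL 62701"
)
```

A "needs_review" result corresponds to flagging the invoice in the validation queue, while a "match" lets it continue straight to export.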

Lastly, the invoice and its associated metadata are routed to SharePoint via our out-of-the-box export plugin. And all of this was accomplished in a way that is completely transparent to the end-user. How many different Microsoft applications did our example touch without a document scanner, clerk, or employee having to key in a single field or manually relocate an electronic file?

If you’re one of the many thousands of companies using Azure for cloud infrastructure, all of this can be installed on the Microsoft cloud to mitigate the burden of server management for IT.

Example #2: Mid-flow Processing

Maybe you need to be able to classify or convert documents or extract metadata through code as a part of an existing workflow. Ephesoft’s OpenAPI provides a powerful document capture automation tool and OCR platform within the world of Microsoft.

Application interoperability through Web Services

Understanding Swagger or OpenAPI Specifications will give us a better base to explain our example.

Swagger, also referred to as OpenAPI Specification or OAS, defines a standard, language-agnostic interface to RESTful APIs. For the non-technical folks in this webinar, it is essentially the definition and description of an API, and this allows people – as well as robotic processes – to discover and understand the capabilities of that Web Service without needing to access the source code or product documentation. When a Web Service call is properly defined within Swagger, a consumer can understand and interact with the remote service with a minimal amount of implementation logic.
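To show what "discovering the capabilities" of a Web Service looks like in practice, the Python sketch below reads a small OpenAPI document and lists the operations it describes. The `paths`, `operationId` and `summary` keys are standard OpenAPI fields, but the service itself, its endpoints and its operation names are invented for this example and are not Ephesoft’s actual API.

```python
import json

# A tiny, hypothetical OpenAPI document for an imaginary capture service.
SPEC_JSON = """
{
  "openapi": "3.0.0",
  "info": {"title": "Hypothetical capture service", "version": "1.0"},
  "paths": {
    "/classify": {
      "post": {"operationId": "classifyDocument", "summary": "Classify an uploaded document"}
    },
    "/extract": {
      "post": {"operationId": "extractFields", "summary": "Extract index fields from a document"}
    },
    "/status/{batchId}": {
      "get": {"operationId": "getBatchStatus", "summary": "Check the status of a batch"}
    }
  }
}
"""

# HTTP methods that may appear as operation keys under a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return (METHOD, path, operationId, summary) for every operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method in HTTP_METHODS:
                operations.append(
                    (method.upper(), path, op.get("operationId"), op.get("summary", ""))
                )
    return sorted(operations, key=lambda entry: (entry[1], entry[0]))


spec = json.loads(SPEC_JSON)
for method, path, op_id, summary in list_operations(spec):
    print(f"{method:5} {path:20} {op_id}: {summary}")
```

A person, an RPA tool or a code generator can work from a listing like this to learn what the service offers without ever opening its source code.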

3 Benefits of OpenAPIs

Minimize IT burden
Enable plug-and-play capture workflow
Better support application interoperability

To start, this is a tremendous benefit for the IT and service teams working on implementing a project. It greatly minimizes the amount of custom coding or scripting any given multi-application solution requires. Moreover, this easy-to-use Web Service call enables true plug-and-play functionality with Ephesoft as a capture mechanism within a document processing workflow. In sum, it expands support for application interoperability, and with so many different systems and applications used in a Microsoft environment, the Ephesoft Swagger is highly valuable.

Mid-process capture with Ephesoft

In this use case, let’s imagine we’re working with an insurance company. Similar to the previous example, we’re going to start this process with an email received in Outlook and end the process with documents in SharePoint.

A client emails an insurance form or claim submission to their agent. The insurance company uses a Robotic Process Automation, or RPA, system to monitor incoming emails. The system grabs the email from the client along with the attached form and sends it to Ephesoft Transact for processing.

Ephesoft identifies the form and document types from the email and attachments, and based on that identification, automatically extracts key pieces of data such as name, claim number, dates, and so on. In this case, Ephesoft also performs a signature check on the forms to make sure the claimant has filled out all necessary signature fields. Once Transact completes its capture workflow routine, the documents, along with an XML file of the extracted data, are picked back up by the RPA system.

The RPA tool then reviews the data that the Ephesoft platform extracted and determines whether the document can be sent to SharePoint and added to the client’s file. If information is missing, such as a signature or a key field left blank, the system can instead send an email back to the claimant to let them know additional information is needed.
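The decision the RPA tool makes at this point can be sketched in a few lines of Python. The XML layout, the required field names and the action labels below are assumptions chosen for illustration; the real file produced by the capture step will have its own structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of fields the claim form must contain.
REQUIRED_FIELDS = ("ClaimantName", "ClaimNumber", "DateOfLoss")

# Hypothetical extraction result; the date of loss was left blank.
SAMPLE_XML = """<Document type="ClaimForm">
  <Field name="ClaimantName">Jordan Lee</Field>
  <Field name="ClaimNumber">CLM-88213</Field>
  <Field name="DateOfLoss"></Field>
  <Field name="SignaturePresent">true</Field>
</Document>"""


def decide_next_step(xml_text, required=REQUIRED_FIELDS):
    """Choose between filing the document and asking the claimant for more detail."""
    root = ET.fromstring(xml_text)
    fields = {f.get("name"): (f.text or "").strip() for f in root.iter("Field")}
    missing = [name for name in required if not fields.get(name)]
    if fields.get("SignaturePresent", "").lower() != "true":
        missing.append("Signature")
    if missing:
        return ("email_claimant", missing)
    return ("send_to_sharepoint", [])


action, missing_items = decide_next_step(SAMPLE_XML)
```

With the sample data, the blank date of loss sends the claim back to the claimant, and the list of missing items can be dropped straight into the reply email.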

In this instance, Ephesoft Transact enters the document progression mid-way through the workflow, but is still providing a crucial element of capture automation to enable this touchless insurance claim process.