OCR Deja Vu: The Build vs. Buy Conversation Won’t Die

IDP and OCR: Delivering Complete Document Automation Solutions

It’s helpful to understand OCR’s extensive history to appreciate today’s intelligent document processing’s full capabilities. If you aren’t familiar with the term, Optical Character Recognition (OCR) technology has been around since the 1970s, originally intended as an aid for the visually impaired. It interprets printed characters and transforms them to readable text. Through the 80s and the 90s, corporations quickly found the value of scanning paper and recognizing the text through OCR software. 

Documents were now searchable and bits of data could be identified for search requirements and extraction for use in business systems. Today, OCR has moved up the food chain and fuels AI engines, business workflow and Robotic Process Automation with data. The OCR engines of the past (and present) that were industry stalwarts, like Recostar or Nuance, have been augmented and surpassed by intelligent cloud OCR services like Google Vision, Amazon Textract and Microsoft’s Computer Vision. These powerful cloud OCR services provide high accuracy and give amazing results on both machine and human-generated text. 

But here lies the problem. As fancy and accurate as these cloud OCR engines can get, the result they provide is the same – raw text with some extended information. The black box still gives you generic text output, with slightly higher accuracy. Many companies assume that is all there is to do...

Necessary Ingredients for Enterprise Technology

Having sold and delivered these intelligent document processing solutions, I can say that OCR, or the process of converting images to text, is about 10% of the solution. At a high level, an enterprise-class IDP solution needs more than basic technology. It needs what I call the OCR “Oreo Cookie”. It’s not just the creamy filling (OCR) that makes the cookie; you can’t package filling on its own. It takes support on both sides (pre- and post-processing) to deliver the whole package.

So what’s missing from intelligent cloud OCR solutions? Let’s look at the pre- and post-processing steps. Our Head Architect for the Ephesoft platform shared a list of the pre-processing steps needed to prepare documents for accurate OCR output and the post-processing steps needed to ensure accuracy, validity and proper delivery.

An example of pre- and post-OCR services to get to the end result

While the list is generalized, there are granular microservices required to perform individual components of the overall steps. The below example is for AP Invoice Processing, and the processes involved in interpreting invoice images.

Pre-OCR Processing

  • Import
  • Normalize
  • File Prep
  • Asset Generation
  • AI Image Prep

Post-OCR Processing

  • Prediction AI
  • Interpretation
  • Table AI
  • Vendor AI
  • Table Reconcile
  • Vendor ERP Data
  • Data Normalization
  • Queuing
  • UI (User Interface for Exceptions)
  • Export

The magic is in what wraps the OCR process, not OCR itself, much like an Oreo cookie.
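
To make the “Oreo” structure concrete, here is a minimal Python sketch of a pipeline in which the OCR call is just one step between pre- and post-processing stages. The stage names loosely follow the lists above; every function, the Document fields and the stubbed fake_ocr engine are hypothetical placeholders for illustration, not Ephesoft or cloud-vendor APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed from stage to stage."""
    raw: str                                   # stands in for image bytes
    text: str = ""                             # filled in by the OCR stage
    fields: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def normalize(doc: Document) -> Document:
    # Pre-OCR: stand-in for image cleanup (deskew, despeckle and so on).
    doc.raw = doc.raw.strip()
    doc.history.append("normalize")
    return doc


def fake_ocr(doc: Document) -> Document:
    # The "filling": a stub where a real OCR engine call would go.
    doc.text = doc.raw
    doc.history.append("ocr")
    return doc


def interpret(doc: Document) -> Document:
    # Post-OCR: turn raw text into named fields ("Key: value" lines only).
    for line in doc.text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            doc.fields[key.strip().lower()] = value.strip()
    doc.history.append("interpret")
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    # Apply each stage in order; OCR is only one of them.
    for stage in stages:
        doc = stage(doc)
    return doc


invoice = run_pipeline(
    Document(raw="  Vendor: Acme Corp\nTotal: 125.00  "),
    [normalize, fake_ocr, interpret],
)
print(invoice.fields)   # {'vendor': 'Acme Corp', 'total': '125.00'}
print(invoice.history)  # ['normalize', 'ocr', 'interpret']
```

Even in this toy version, swapping one OCR engine for another changes a single stage, while the surrounding stages are where the document-specific work lives.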

The Choice: Build or Buy? 

For decades, the CIO has been faced with the eternal question: Do I buy it or build it myself? In 2021, the age of Robotic Process Automation (RPA) and no-code/low-code platforms, there are way too many opportunities to go down the rabbit hole and create “apps” that are inefficient, don’t meet user expectations and fall short. To avoid any potential pitfalls, the modern-day CIO must ask these 10 questions:

  1. What are the desired results from an OCR or data extraction perspective that will mean success for the business? 
  2. Are my documents “born digital” or are they copies of copies of physical documents?
  3. Is my team capable of building and supporting the pre- and post-OCR/extraction process components?
  4. Does my team have expertise or experience in writing document processing code and/or applications?
  5. Do I have the clean and accurate data to build ML models to interpret the OCR results from <XYZ OCR engine>?
  6. Do I really want to be a software/RPA development house?
  7. Is 80% accuracy sufficient for the overall business value it provides?
  8. Who will update my application and keep it current with subtle document changes and new documents that need to be interpreted, configured and understood?
  9. How will I manage app security and cloud OCR data?
  10. What is the fastest and most cost-effective path? 

The Ephesoft Difference

I wrote this post because we routinely get enveloped quickly in the build vs. buy conversation, especially with larger prospects and customers. There is always a group, consultant or team that raises their hand internally and says: “That’s easy, we can build it with RPA and Textract (or Google or similar solutions).” Don’t be fooled by the lure of fancy OCR promises and remember the Oreo cookie – a strong foundation will only benefit your digital transformation initiative.

Ephesoft has leveraged its 10 years of deep capture experience to build the next-generation intelligent document processing platform to drive productivity through document context. Combining the best of OCR, AI, machine learning and rules, we achieve unmatched accuracy and straight through document processing. Contact us today for a live demo and see how we can help you drive business results.


Are You Average or Best-In-Class?

Evaluating Your Accounts Payable Performance With Key Industry Metrics

Invoice processing is the same whether you are a 100-person company or a Fortune 500: arduous and painful. And the fact remains that right around 50% of organizations worldwide still rely on a manual process to receive invoices, enter their data into an ERP, approve and finally pay. This figure doesn’t even include those companies that use legacy document automation systems with basic features, which promised efficiency but barely save time with their limited functionality.

So why is that? The full procure-to-pay cycle is vast and complex, and many organizations hesitate to take on the whole “beast” because they don’t have the in-house expertise or believe that they just can’t afford an end-to-end solution, the project time or the resource draw.

The first and most important step is to get a snapshot of where you sit today. It’s easy to do that with some benchmarking around key metrics. Below is an initial set of recommended metrics, an average of the typical accounts payable (AP) department and a best-in-class target.

1. Invoice Processing Cost – Many accounting departments have no idea what it costs them to process a single invoice. AP Staff time/wages, late fees, missed early pay discounts and approver time/wages are basic contributors to the process cost, which can balloon with lost invoices and poorly executed processing.
Average Cost per Invoice*: $11
Best-in-Class AP Departments: $2.20

2. Time to Process an Invoice – Just how long, from the time of receipt, does it take you to complete the cycle and pay your vendors? The main contributors for those without an invoice automation solution are physical invoices that need to be scanned, approvals and the processing of exceptions.
Average Time to Process an Invoice: 10 days
Best-in-Class AP Departments: 2 days

3. Touchless Invoice Processing – The Holy Grail of invoice processing is a touchless process or straight-through processing. That magical invoice that gets processed and placed into the ERP with your data automatically entered and queued for payment. The key is that this happens with minimal to no accounts payable staff intervention.
Average Straight Through Invoice Processing Rate: 30.4%
Best-in-Class AP Departments: 90%

4. Invoice Exception Processing – Exceptions are the painful piece of the AP invoice process, and typically consume the majority of staff time. Wrong totals, missing items and missing vendor or approver information can lead to delays and missed opportunity for time-savings.
Average Exception Rate for Invoices: 25%
Best-in-Class Exception Rate: 10%
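
As a rough self-check, the four benchmarks above can be put into a small Python sketch. The average and best-in-class figures are the ones quoted above; the rate function, its three labels and the sample department numbers are hypothetical and chosen only for illustration.

```python
# (average, best_in_class, lower_is_better) for each metric quoted above.
BENCHMARKS = {
    "cost_per_invoice_usd": (11.00, 2.20, True),
    "days_to_process": (10.0, 2.0, True),
    "straight_through_rate_pct": (30.4, 90.0, False),
    "exception_rate_pct": (25.0, 10.0, True),
}


def rate(metric: str, value: float) -> str:
    """Label a value as 'best-in-class', 'above average' or 'average or below'."""
    average, best, lower_is_better = BENCHMARKS[metric]
    if lower_is_better:
        # Flip signs so that "bigger is better" holds for every metric.
        value, average, best = -value, -average, -best
    if value >= best:
        return "best-in-class"
    if value > average:
        return "above average"
    return "average or below"


# Hypothetical department, for illustration only.
my_department = {
    "cost_per_invoice_usd": 6.50,
    "days_to_process": 2.0,
    "straight_through_rate_pct": 28.0,
    "exception_rate_pct": 25.0,
}

for metric, value in my_department.items():
    print(f"{metric}: {rate(metric, value)}")
```

The three-way labeling is a simplification; a real benchmark would likely weight the metrics and compare against peers in the same industry.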

How did you rank? Average or Best-in-Class? Here at Ephesoft, accounts payable is the most common reason our customers around the world implement automation. We focus on making the manual invoice processing workflow efficient and accessible: invoice ingestion, data entry and routing. Document and data capture and extraction is an easy win for any AP department. It’s a great first step to improving your accounts payable processing and freeing your staff for more important tasks. Contact us today and ask about Semantik Invoice to leave average behind.

Take the Accounts Payable Health Check! Use this AP benchmark tool to determine where you fit on the invoice processing spectrum compared to others in your industry.

*Statistics courtesy of Ardent Partners Accounts Payable in 2021 report


Enabling the Citizen Developer - Connectors

Time-to-Value: The Demand

If there is one thing we have learned here at Ephesoft, it is that over the years prospects and customers have grown more and more intolerant of long deployment projects and high professional services bills. Applications that come preconfigured, or immediately provide value with minimal configuration, are expected and now considered the norm.

New Developments in Intelligent Document Processing and RPA

In the document capture space, now called intelligent document processing (IDP), there have always been several barriers to a rapid return on investment:

  • Infrastructure deployment – The setup of servers, cloud instances or nodes customized for scale and specific customer’s use cases can be long and drawn out.
  • Configuration Services – Legacy providers have “out-of-the-box” static configurations that get mediocre results at best and require templates and extensive tweaking to provide less than adequate results.
  • Integration – The complexities of having a process work hand-in-hand with a repository or system of record typically require some custom work or code.
  • Periodic Updates – The flow of documents into an organization is dynamic and ever-changing. Vendors change their invoice format, forms are revised and new types pop up constantly.

These factors have slowly been remediated over the past five years or so, mostly due to the entrance into the market of robotic process automation (RPA) and the redefining of what constitutes a true document processing solution. Cloud or simple desktop installations removed the need for extensive infrastructure. Applications became targeted at the “Citizen Developer” and anyone could build a flow, supposedly. Integrations became handled by a screen scraping army of bots. And the advent of artificial intelligence (AI) was the savior in regard to the endless stream of tweaks required to keep the system at peak accuracy. Or so we all thought.

Reality Hits Home

Intelligent document capture is hard. Processing documents that are received by an organization can be summed up by the 80/20 rule. It’s easy to get 80% accuracy in data extraction. Everyone can do it. That is the nature of the beast. But the real value, or the secret sauce, is getting that elusive 20% that takes up 80% of your time. Legacy capture solutions on the market just can’t get there, and we see over and over that prospects struggle to keep even 80% accuracy alive.

The Solution

Over the past year at Ephesoft, we have been focusing on our next-generation product, or what I like to call the “20% killer.” Semantik is our cloud-based document AI platform that has been built to take customers and future customers to the next level. We focused on using AI to tackle the hard problems, not the easy ones. Turn it on, do some basic configuration, and be up and running almost immediately with results. Semantik Invoice was built on hundreds of thousands of data samples along with input from our customers, partners and employees. It provides unmatched accuracy out-of-the-box in competitive tests and can fit the needs of large and small customers alike.

The Next Phase: Integration – The Uber Plug

Pundits estimate that over 60% of all business processes include documents or records. Think about all the flows within your organization that include a PDF, the scanning of paper, images from phones, etc. Invoices, sales orders, paper mail (!), forms, NDAs and on and on. There are hundreds of these micro-processes that go on every day, killing productivity and preventing employees from performing higher-level, higher-value tasks. What if those employees were empowered to simply tap into an automation service for documents? What if they could use a familiar application to build their own intelligent document process?

Today, at Ephesoft, we are launching a new series of connectors to empower Citizen Developers with a document automation toolset to free up time and improve their productivity levels with a simple point and click (no code) setup. The connectors also provide unmatched integration capabilities and broaden the opportunity for productivity improvements across all enterprise applications. Here are our current connector toolsets available:

  • Semantik for Invoices – Microsoft Power Platform – This connector was built and certified through Microsoft and provides any business flow extracted invoice data. As a certified connector, it is available in the connector library for Power Automate, Azure Logic Apps and Power Apps. You can see the profile here: Ephesoft Semantik Connector for Power Automate.
  • Semantik for Invoices – Workato Shared Connector – Leverage our shared connector through any recipe using Workato. Contact us for free trial access to the Semantik for Workato Connector.
  • Ephesoft Transact Connectors for Power Automate and Workato – Visit our community connectors available for Ephesoft Transact within our Ephesoft Labs GitHub Repository.
  • Ephesoft to Catalytic – Easily add no-code workflows and automations to analyze and process extracted data.

For more information on our integration platform and connectors, contact us today.


Introducing Ephesoft Labs Community on GitHub

Today, we are launching a new and exciting chapter to the Ephesoft Story: Ephesoft Labs. Before I give you an overview, I’ll provide a little background on the genesis of the offering. It started with several discussions, mostly around maximizing the value of our products to our customers. What can we do to drive maximum automation, efficiency and productivity broadly and with low incremental cost? 

We began examining what we believe is our most valuable asset: over a decade and a half of deep capture experience and saved professional services and pre-sales projects. What would be the value to our ecosystem of customers and partners if this was available as an online repository? The result is Ephesoft Labs. 

Ephesoft Labs provides GitHub repositories of just about every document workflow, script, process and configuration from the leading edge of capture, all available and open source at no cost to our customers and business partners. Built on an open source framework, the repository will be accessible to all members of the Ephesoft community to join, use and contribute (Phase I is available to selected partners and customers through the beta period).

Here’s a quick tour and some highlights:

Our Technology Partners (Available publicly August 25, 2020)

We have some great technology partners that we have worked with for years and we have built integrations, modules and scripts into their systems or vice versa. Those custom-built integrations are now available in Ephesoft Labs. Here are just a few examples:

  • Nintex Extension – This provides a simple and powerful extension that allows the import of an Ephesoft Workflow tool kit into the Nintex Workflow Designer.
  • Microsoft Power Automate – Microsoft’s primary workflow and integration system, Power Automate, can now be enabled with Ephesoft’s new web-based acquisition platform, Semantik. Semantik currently provides a state-of-the-art AI model for invoice data extraction.
  • Blue Prism VBO – Customers and partners can now download a Visual Business Object (VBO) for a design canvas integration in Blue Prism.

This is just a sampling of what’s available and our technology partners will be a big focus, enabling customers to integrate just about any system by leveraging and adding into the Ephesoft Labs library.

Document Types*

At the core of Ephesoft’s technology is the ability to classify and extract data from different document types. Over time, we have amassed a large library of all different common business document types. A common request was to provide a library of documents that would allow customers to immediately glean value from the product, get to production fast and achieve a quick return on investment. In the repository, we now house common business document types, with the following examples:

  • Tax documents – W-2, 1040, W-4, W-9 and others
  • Mortgage/Loan Documents – Loan estimate, closing disclosure, etc.
  • Accounting documents – Invoices, bills of lading, etc.

Integrations*

The flexible nature of Ephesoft’s platform allows for simple integration to a wide variety of backend systems. Our full repository of custom-built integrations, including SAP and Azure, is now shared, along with source code for implementation and/or customization. With many more being prepped for addition to the Ephesoft Labs repository, this flexibility will provide customers with added use cases for Ephesoft’s products.

Automated Extraction Solution*

This is a solution built internally for processing invoices and extracting their data. It was built to learn: as users process documents, it builds rules based on confidence scores for specific vendors and learns to identify those vendors and their associated data.

Much, Much More

In addition to the above, the repository includes a long list of assets, from basic scripts to enhanced solutions for niche use cases like redaction, auto-extraction and advanced machine learning. The repository will be open for external contributions as well, allowing partners and customers to contribute value and extend the value of the offering.

Ephesoft Labs is Phase I of Ephesoft’s vision for an extensive network of powerful technical resources that will be unmatched in the intelligent document processing (IDP) market. Follow us for our next phase, and updates.

*Follow us for public availability or contact your Ephesoft Sales Representative.


The Elastic Enterprise: Amplify Productivity in 2020 and Beyond

The Elastic Enterprise is a new concept we’ve established to describe the new requirement for surviving in the current and post-COVID-19 era. At Ephesoft, we specialize in creating automation within organizations and driving maximum productivity, helping customers reduce their human touch on business processes and drive amazing results with our platform.

With the current state of the world, work as we know it is going to change and it is up to us to adapt and overcome in this new global environment. We have to enable our people to work and succeed under new circumstances. If we are going to survive as an organization, every aspect of our business needs to go through a transformation. But how do we approach this uncertainty, and how does it translate to a strategy we can implement as a business?

Black Swan Events

If you aren’t familiar with the term “black swan event,” it is a phrase commonly used in the world of finance to describe an extremely negative event or occurrence that is impossible or difficult to predict. In other words, black swan events are unexpected and unknowable. These events seem to be more and more common. Just look at our history: floods, fires, COVID, cyberattacks, political disruption, financial crisis, Brexit, and many others over the years.

That was the past though, right? Yes, but we must be prepared and equipped to handle future black swan events, even if we can’t predict them. Just as the human race needs to flex, adjust and adapt in order to overcome and survive, businesses need to be elastic to prevail, too.

What is Elasticity in a business environment?

First, let’s look at the definition of elasticity: the ability to return to normal shape. The ability to change and adapt. I think we can all agree this is a requirement for business in the new normal, from our business operations to IT to how we work with customers, partners and employees.

The Elastic Enterprise is a strategic mindset and an overarching theme that gives us the ability to adapt and change dynamically to external market and world forces. It provides for a constantly mobile and transient workforce that can quickly migrate and provides all the services and applications for users to be productive and complete their work. 

Its main tenets are:

  • It provides infrastructure and services that are scalable and available to employees, customers and partners at all times.
  • It creates a foundation for extending automation to where work is being done.
  • It provides the environment for new apps and next-generation intelligence to thrive.
  • And it can provide all this anywhere, any time.

Organizations around the world are figuring out how to become more elastic in their strategy. A recent IDC survey revealed these top priorities from CIOs during the pandemic for building a strategy and foundation for the new normal:

  • Strengthen software capabilities for digital innovation at scale
  • Create a digital culture
  • Improve efficiency and reduce costs to optimize operations
  • Create a new remote office and collaboration system

Challenges to Building a New Model

What becomes clear in building the Elastic Enterprise to maximize productivity is the need for the following: providing context to remote workers, fully adopting the cloud, laying the groundwork for next-gen apps and truly working from anywhere. One factor that contributes to adding context is the “watercooler effect.” An office is a place for collaboration and socialization, and we just didn’t know how important it all was until it was removed from our daily grind. With a large percentage of users now remote, that source of discussion and knowledge was removed.

With not many options, virtual meetings become our way of interacting with coworkers, partners and customers. But is it really a replacement? Psychologists and social researchers are investigating this idea. They have found that there are quite a few issues with remote meetings. First, focusing on the meeting is difficult with all your applications available. Emails come in, chats are started and work is available to do. In addition, the “mirror effect,” or the habit of constantly checking your expression and adjusting, can consume some users. The cognitive load of these distractions, combined with our brains struggling to read normal body language or signals, is taxing on our systems and wears us down mentally.

I work remotely and love to get back to our office headquarters. I get so much more out of those face-to-face conversations and can get things done quickly with focus. People leverage other people and their tribal knowledge. It adds context and provides a basis for problem solving and being productive. It’s this living contextual library, removed and now absent with new remote work, that many have struggled to overcome.

Are social and collaboration apps the answer? It’s easy to take things out of context in chat. What could be done with a 5-minute live conversation now becomes an hour-long string of chats that are missed on the other side. In addition, not all workers are tech-savvy, and some may lack the skills to get the most benefit from these apps. With email duplicating conversations and multiple chat platforms, conversations and meaning can be lost.

Context is King

Context will arise over and over as a key theme in ensuring maximum productivity in a dynamic landscape with remote workers. To truly be effective, applications will require “contextual extensions” that can provide instant reference and context to a discussion. For example, an extension on a Slack accounting conversation can add documents, data and information, which helps to add context and depth to a conversation, augmenting capability and deepening the ability to be productive in the moment.

Tapping into cloud architecture and solutions will help build the elastic enterprise. It seems like we have been talking about the move to the cloud forever, but it is still a hot topic with the 2020 cloud spend predicted to be over $231 billion. There are still organizations that won’t move, either because of perceived security risks or because it just hasn’t been high on the priority list.

Cloud is a Top Priority for CIOs

COVID has changed all that for now and the cloud is at the top of the priority list according to this IDC survey of CIOs. The cloud has become more of a priority to enable the elastic enterprise.

Cloud allows for the distribution of services to your users and pervasive access. A combination of SaaS offerings, private cloud and public cloud can give organizations a hybrid approach to providing extended automation and driving productivity to all users, regardless of location or device. In general, the cloud allows for scale-up and scale-down operations to conserve costs and handle unknown fluctuations in volume. It provides always-on, dial tone availability to users globally. Many organizations have quickly realized it can provide the horsepower from a processing perspective to power machine learning and AI efforts. Finally, the lack of physical locations and assets to manage eliminates the rigidity that was a blocker for elasticity.

Today’s Remote Toolkit

As humans, we have adapted to remote work during this crisis but our applications have not. With a focus on enabling high levels of productivity, we as a community have tried to take our current corporate toolkit and make it available for all to leverage. Content services and file sync apps allow for remote access and sharing. Collaboration tools and mobile capture provide a simple way to onramp files, and most workflow tools can be used to maintain continuity with the home base. But what about robotic process automation (RPA)? And where is the context and AI?

With many of the RPA implementations being attended processes that require desktop access and user engagement at some point, leaders need to re-evaluate RPA strategies and digital transformation plans. How can they move to unattended processing and eliminate the “robot helper” mindset or human in the loop? How can they create a digital watercooler for digital workers to glean context and reduce human input?

There must be a move to using AI and machine learning tools. According to Gartner, “AI augmentation of processes generates $2.9 trillion in business value and can recover 6.2 billion hours of worker productivity.” It’s no mystery that AI is required to make the next leap and drive new levels of productivity and automation in our applications. By augmenting our automation and processes, we can drive unseen levels of productivity and save billions of hours that can be spent on more important tasks.

Barriers to AI

While there are strong barriers to implementation, the elastic enterprise can remove most of them. With the elastic enterprise comes a new level of transformed business that can take advantage of new infrastructure to drive AI initiatives. As the move to the cloud continues, most cloud vendors have strong AI infrastructure and toolsets that are readily available and can provide a seamless integration point for applications to take advantage of machine learning and AI. Finally, vendors are taking a pre-built solution focus that will give companies an easy starting point or provide solutions that can provide value immediately.

Tomorrow’s Toolkit

Soon, our future tools will provide enriched data around and in applications to drive context and provide users with the information they need now and in the future. Knowledge graphs will automatically provide connected and related data, ensuring context is front and center, and collect true enterprise process knowledge. This understanding through artificial intelligence will drive augmented automation across the elastic enterprise and new levels of productivity.

Will office headquarters continue to exist? Will they become obsolete? In some conversations with commercial realtors, they see organizations not shrinking space, but reorganizing and repurposing it not only for changing staff sizes and remote workers but also for transient workers: those that come in here and there for meetings and to catch up with coworkers, but only stay for a day or part of one. With higher percentages working from home, the office transformation will be focused on hoteling and meeting spaces.

To accommodate this ebb and flow of workers, that elastic foundation needs context and automation, making it available in the office, at home and anywhere the user decides to park for the day. This consistency of service will drive productivity and remove barriers to efficiency.

Digital Workers and the Digital Office

Do robots need their own office? The current crisis has brought to light the fact that digital workers are immune to the virus and are not subject to ebb and flow. They can remain up, and ready to work, especially if they are now converted to unattended processors. The obvious place for this office is a virtual carve out in the cloud or with an RPA SaaS provider. But are they ready to be autonomous? Do they have the context and AI to run unattended in their own environment?

Hopefully, this core set of principles and subsequent questions makes sense, and you can see the foundation that needs to be built. How will you make your enterprise elastic to enable productive, remote work anywhere? With an elastic infrastructure in place, organizations can revisit and refine their strategy to achieve high levels of autonomy and productivity.

Providing context to your applications, especially RPA and workflow, can reduce human interaction by giving your digital workforce and applications the data required to make decisions without human help.

How do we climb the barrier? The solution to this problem is an overarching theme called Context Driven Productivity and it starts with becoming an Elastic Enterprise. Rigid organizations will die – we see it in the news every day. A focus on people, remote work, ensuring productivity through new applications and a focus on providing automation will glean amazing benefits. Finally, context will fuel the next wave of true autonomy as we give both human and digital users and our applications the context they need to make decisions, reduce human-in-the-loop requirements and take us to the top level of the autonomous journey.

Register to watch the webinar on Ephesoft's BrightTALK Channel here.


Incremental Automation: Opportunity for Efficiency and Productivity Improvements

Let’s face it: the document problem just isn’t going to go away this year. It didn’t get resolved last year or the year before, or the decade before. As we enter this new year, organizations will still struggle with documents, but with added pain: the addition of the digital worker.

The popularity and momentum of Robotic Process Automation (RPA) add a new layer of complexity to the problem, and organizations with a hybrid human-digital workforce require flexible, accurate document intelligence solutions that can serve both types of workers.

Process automation applications, like RPA, create a new category of opportunities for efficiency and productivity improvement: incremental automation. When we examine any document process, each touch, action or review by a worker is lost time, a lapse in productivity and a hit to an organization’s overall efficiency. If we can methodically find the document processing chokepoints for both physical and digital workers, and eliminate them one by one, the sum of those incremental improvements can be massive. How do you eat a whale? One bite at a time.

How can we find these opportunities for incremental process automation? Here are some tips on identifying red flags in both your physical and digital processes.

High Volume Document Flow – Every department in an organization receives documents in some way, shape or form during the business day. Whether these documents are physical (mailroom) or digital (email, uploads, etc.), areas of high document flow are always excellent targets for automation. Opportunity: Documents can be classified and routed automatically to the correct destination.

Manual Data Entry – Anywhere in the organization where document data is manually entered into a system of record or document repository is a prime target. Even the smallest of opportunities can have a massive impact when spread out throughout multiple departments. Take the administrator who spends an hour every day keying information from email attachments. If that is being done by 5, 10 or 20 workers in an organization, the hours – and inefficiency – add up. Opportunity: Data can be auto-extracted and placed in any system.

Manual Sorting/Splitting – Sorting and splitting documents is a massive time waster. Take for example a PDF that houses three documents. The time and effort to split, save and rename even at low volume can consume precious hours. Opportunity: Documents can be automatically split, named and saved.

Data/Document Validation – As documents and their associated data flow in, it can take specialized workers to ensure proper data, and validate that it is correct. Errors that pass through to end-systems can be costly. Opportunity: Automated validation and exceptions processing can focus users on only those documents that need attention.
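Three of the opportunities above (classification, routing and validation with exceptions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes each document has already been converted to text and its fields extracted, and the keyword rules, queue names and required fields are hypothetical placeholders rather than part of any Ephesoft product.

```python
from dataclasses import dataclass, field

# Hypothetical keyword rules: document type -> words that suggest that type.
KEYWORDS = {
    "invoice": ("invoice", "amount due", "remit"),
    "resume": ("experience", "education", "skills"),
    "contract": ("agreement", "party", "hereby"),
}

# Hypothetical routing table: document type -> destination queue.
ROUTES = {
    "invoice": "accounts-payable",
    "resume": "human-resources",
    "contract": "legal",
}

# Hypothetical required fields per document type, used for validation.
REQUIRED_FIELDS = {
    "invoice": ("vendor", "total"),
    "resume": ("name",),
    "contract": ("parties",),
}


@dataclass
class Document:
    text: str
    fields: dict = field(default_factory=dict)


def classify(doc: Document) -> str:
    """Return the type whose keywords appear most often, or 'unknown'."""
    lowered = doc.text.lower()
    scores = {t: sum(lowered.count(w) for w in words) for t, words in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def validate(doc: Document, doc_type: str) -> list:
    """Return the names of required fields that are missing or empty."""
    return [f for f in REQUIRED_FIELDS.get(doc_type, ()) if not doc.fields.get(f)]


def route(doc: Document) -> str:
    """Send clean documents to their queue and everything else to exceptions."""
    doc_type = classify(doc)
    if doc_type == "unknown" or validate(doc, doc_type):
        return "exceptions"
    return ROUTES[doc_type]
```

With these placeholder rules, an invoice that carries both a vendor and a total goes to the accounts-payable queue, while an invoice missing its total lands in the exceptions queue, so people only look at the documents that need attention.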

These are just a few ways to identify those physical and digital processes that are key targets for incremental automation. When will you automate?

For more reading:

Incremental Document Automation with APIs

Pervasive Capture: The New Age of Efficiency

OCR for RPA: Document-Aware Robots


5 Key Areas for Enterprises to Accelerate Remote Work

It would be a bit obvious and general to say that COVID-19 has changed the way we function as a society. There is no one, or nothing, that has not been affected by this global crisis and it will change the way we look at business and operations now and for the long-term future. As I work from home, abstain from my normal travel and make daily operational observations in my dealings with colleagues, partners, prospects and customers, it is clear that there are several areas organizations will need to focus on to survive in the current and post-COVID-19 world.

I have a unique position in that I interact with many different industries, business people and technology, which gives me a bird’s-eye view of where the pain exists and the choke points that were created. Let’s look at five key problem areas that organizations will need to conquer in order to adapt and overcome future crises:

Enabling Remote Work

The first and most obvious is the ability to conduct 100% remote work and keep the business engine humming from a distance. Remote access, VPN, meeting and collaboration apps and many other remote-enabling technologies will need to be solidified and procured for workers that are typically “in-office” personnel. In addition, CIOs will be taking a critical look at legacy applications, client/server architectures and non-cloud ready apps that have hindered remote work. In 2020 and beyond, a web browser should be all you need to get any job done. Paper will have to be eliminated and the outsourcing of mail processing will be a necessary requirement. According to the IDC COVID-19 Tech Index Survey, cloud environments will be a top priority for enabling remote work.

Scalability and Bursting

Certain industries will have to ensure they can handle inordinate volumes of data and documents. Instant scalability and the ability to handle large bursts in volume are not optional if organizations want to quickly adapt without bottlenecks due to processing power. We’ve seen this in healthcare and government during the current crisis, with screening centers that have 5-hour waits and programs that are attempting to process applications for government subsidies but are overwhelmed, causing long wait times. A move to the cloud or hybrid applications that can easily auto-scale will be the only option for these industries, including healthcare.

Go Mobile

Individuals should be able to do just about anything from a mobile device. New account openings, document submission, employment screening and any other actions need to be mobile-enabled. Mobile apps can act as onramps for existing systems. The onramping processes need to be pervasive and allow the acquisition of any type of document, data, image and content. For example, banks can use mobile apps for new account openings or for submitting identification and other supporting data for loan processing. Automating remote processes like these will not only expedite the process, but can also cut costs and improve accuracy.

Context is King

Without the office environment, and with a detached workforce, individuals are isolated and often lack the necessary knowledge to make decisions. Employees and managers can benefit from more curated, enriched data to reach their maximum productivity levels. An application that provides that extended data and streamlines processes and work will be in high demand and become a requirement. Adding context to data can offer workers a comprehensive picture of what the data means, allowing for faster and better decision-making with deeper insights.

Point Solutions

A growing number of organizations are eliminating existing key business flow problems, while also solving new ones that have popped up from the “work from home” mandate. Removing these blockades will be a primary focus coupled with improving automation in high-volume areas. Examples include end-to-end automation for processes like Accounts Payable invoice processing, new employee onboarding, streamlining contracts or insurance claims and any other manual data entry.

Ephesoft solutions relieve many of these pain points natively and can help your organization survive and prosper in these new and unprecedented times. Contact us today to find out more about:

  • Ephesoft Transact and Semantik Platforms – enablers of remote document and data acquisition and enrichment
  • Ephesoft Transact Cloud HyperExtender – an auto-scaling, hybrid cloud service for high-volume and bursting process requirements
  • Ephesoft Mobile – a mobile content capture SDK for document acquisition on-the-go.
  • Semantik Platform – a context driven productivity, cloud-based platform to provide context, accelerate processes and enhance productivity for knowledge workers 
  • Semantik Invoice –  a new cloud-based point solution for accounts payable automation 

The Importance of Contextual Data in Your Digital Transformation Journey

Introduction to Context Driven Productivity and Ephesoft Semantik 

Today, I want to share with you some exciting new developments you may have seen in the media. This month, Ephesoft launched a new concept along with a beta of the first phase of a revolutionary product. We will introduce Context Driven Productivity and our new web-based, contextual productivity platform, Semantik.

To set the stage, it is essential to understand what we mean by contextual data. Context is defined as the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed. How can we, as users, be more productive with contextual data?

We have all seen the statistics on the data growth problem in the enterprise, along with the impact it is having in many areas. CIOs and organizations understand that issue. But where does context come into play?

The Missing Piece: Context

In our ten years of providing document-centered automation solutions, we noticed something. Time and time again, users had just a piece of the puzzle. With data missing, they just didn’t have the context to get their jobs done effectively. How many tabs do you have open on your computer right now? Think about how many times per day you switch apps or tabs to get your job done. We have just accepted the fact that flipping from tab to tab, opening app after app, and copying data from here to there is just the way we do business.

I call it the “gathering.” Go to or use one system, get a data set or find a bit of information, then go to the next system, and use that data to find more data. And so on, across system after system. As humans, we carry this method of gathering data for context in our heads as experience, and applying that experience is complex. When organizations try to transfer it to bots or systems, it creates a whole new problem and requires constant exception processing. Too much time is wasted on these gathering efforts.

This problem is not isolated. Look across the organization and you will find the contextual data problem blocking productivity everywhere: in accounting with invoices, in human resources with onboarding, in sales with order processing and in legal with contracts. This broad inefficiency when it comes to processes and context is pervasive in most organizations, large and small.

This contextual problem is one of the core reasons for broader failures. Without context, you are limited in what you can achieve, and your automation goals are hindered. There are some fairly disturbing numbers put out by McKinsey on the failure rates of both digital transformation projects and RPA projects. Automating a bad process or trying to transform a large department where users and systems don’t have all the data they need just doesn’t set you up for success. It replicates what you have today, moving a broken physical process to one that is now digital. There has to be a better way.

Examining the Problem

Is context really that critical? Is it worth the effort? If our processes take into account only the flat, two-dimensional data that we have in hand, it not only hurts our productivity but also forces us to make decisions without the full 360-degree view. For example, let’s look at the hiring process. If you are involved with this process at any level, you know it can be a massive time-consumer. Think of the time you spend each day reviewing resumes and conducting interviews, only to find out later that hidden issues or factors rule out candidates. It is frustrating.

I receive a resume that looks fantastic. This individual checks all the boxes, but if all I have is this flat data and I am constrained to just the resume information, I lack enriched data from other sources that sit outside that one document. I don’t know that her resume doesn’t match LinkedIn, that she is a frequent flyer when it comes to applying to our company, or that we are bound by a hidden paragraph in our partnership agreement and cannot hire her. I also don’t know that she has a legal history, that her father is chairman of the board of our largest client, or that her passport has expired. And this is for an international travel position! That context is all missing today.

If we examine the levels of enterprise autonomy and the journey to the highest level of automation, we’ve built a staged progression. Organizations and their different departments can be at any stage of the digital transformation journey, some doing some basic task and process automation, and others at advanced levels. But one thing is constant: there is a wall that prevents reaching higher levels if you lack context, and don’t have the enriched data and understanding to truly begin to learn all the data required for a task or process. This wall prevents access to knowledge for humans, digital workers and systems so they can make decisions in real-time and achieve the highest levels of productivity. If we can tear down this wall, it can free processes from human involvement, allowing workers to focus on higher-value tasks for the organization.

A New Discipline: Context Driven Productivity

How do we fix it? How can we apply a stronger methodology to process learning and autonomy to reach the highest levels on the autonomous journey? Ephesoft has introduced a new way of examining the context problem called Context Driven Productivity (CDP). Let’s take a look at what CDP means for the organization.

In our vision, the enterprise has the means to provide contextual data to all humans, digital workers and systems, removing the contextual wall. CDP takes a new and innovative approach to acquire data from existing processes, enrich that data and finally amplify the value of that data to increase productivity. It is not a product or a technology, but a framework to reach highly autonomous processes, with a focus on providing a 360-degree view of all the information to understand and make fully informed decisions through the presentation of timely data.

CDP consists of three core pillars that provide a foundation to understand a process or task, all its required data, and then leverage that info today, and use it for the future. Those pillars are: acquire, enrich and amplify.    

Let’s return to the question: how do we build context? Is it just a matter of collecting data? The answer is “No,” because if that were the case, it would be easy. The best way to explain the required underlying technology is to think about a crime board. Everyone has seen a CSI show or one where investigators are trying to connect the dots to find a murderer or criminal. That crime board is a simple example of the complex technology behind the scenes and what the CDP methodology can build. Essentially, it is a series of links and relationships between entities and their attributes. In this case, we know where the murder weapon was found, its manufacturer and the DNA associated with it. We can start to gather how it all fits together. A digital version of this crime board is what runs under the hood of our new platform that enables CDP.

To fully enable CDP, we need some new and innovative technology.

  • Semantic Acquisition Engine (enrichment-enabled data)
  • The ability to enrich data and create relationships and links (knowledge graph aka digital crime board)
  • Amplify the result (easy access to acquired knowledge through secure, shared services)
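The “digital crime board” idea of entities, attributes and links can be sketched as a tiny in-memory graph. This is a minimal illustration, not how Semantik is implemented: the `KnowledgeGraph` class, its method names and the relation labels are all hypothetical, and the sample data simply reuses details from the hiring example above.

```python
from collections import defaultdict


class KnowledgeGraph:
    """A tiny in-memory graph of entities, attributes and labeled links."""

    def __init__(self):
        self.attributes = defaultdict(dict)  # entity -> {attribute: value}
        self.links = defaultdict(list)       # entity -> [(relation, other entity)]

    def add_entity(self, name, **attrs):
        """Create the entity if needed and merge in its attributes."""
        self.attributes[name].update(attrs)

    def link(self, source, relation, target):
        """Store the link in both directions so context can be gathered from either end."""
        self.links[source].append((relation, target))
        self.links[target].append((f"inverse of {relation}", source))

    def context(self, start, max_hops=2):
        """Collect every entity reachable from `start` within `max_hops` links."""
        seen = {start}
        frontier = [start]
        for _ in range(max_hops):
            next_frontier = []
            for entity in frontier:
                for _relation, other in self.links[entity]:
                    if other not in seen:
                        seen.add(other)
                        next_frontier.append(other)
            frontier = next_frontier
        return {name: dict(self.attributes[name]) for name in seen}


# Hypothetical data taken from the hiring example.
graph = KnowledgeGraph()
graph.add_entity("candidate", passport_status="expired")
graph.add_entity("resume", role="international travel position")
graph.add_entity("partnership agreement", clause="no-hire")
graph.link("candidate", "submitted", "resume")
graph.link("candidate", "covered by", "partnership agreement")

# One lookup returns the candidate plus everything linked to her.
view = graph.context("candidate")
```

The design point is that the “gathering” happens once, when links are created, so a person or a bot asking about the candidate gets the related resume and agreement in a single lookup instead of visiting each system in turn.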

So, we’ve seen the problem, and talked about Context Driven Productivity. What is Ephesoft doing to solve this problem and improve productivity in organizations? I’ll now give you an overview of our new platform and vision to help customers drive deep knowledge of their core processes and improve user productivity.

The Solution: Semantik Platform

Semantik will address the key pillars in CDP through a robust set of tools directed at improving the acquisition and creation of semantic data, the enrichment of that data through internal and external sources and the means to leverage the data across the Enterprise. By providing enrichment-ready data at the front of the process, strong context can be built and saved into a knowledge graph to provide human and digital workers with an extensive means to complete tasks and processes quickly and automatically.

If we look under the hood, we have the technical infrastructure to create that digital crime board and build context through AI using data entities, links and broad relationships. The end result is what is called an enterprise knowledge graph, which contains a map providing a 360-degree view over time of anything the organization handles on a daily basis.

If we return to our hiring example, which was hindered by the context barrier, we now have full access to the extended story behind the resume. We have a clear 360-degree view of all the data and how it is related, and can create full context and understanding so effective decisions can be made quickly without the laborious gathering.

Single Pane of Glass

The end result is to provide a single pane of glass view of compiled context that users can access, and that systems and digital workers have available in digital form. This pane is a visual looking glass into the entities, links and relationships that are formed through enriched data, providing all the information to complete a process or task.

At Ephesoft, we have mapped out a phased journey to support CDP initiatives worldwide, focused on the three pillars of CDP: acquire, enrich and amplify.

What Can Semantik Do Today?

We have started with a focus on acquisition, drawing on our history in document capture to guide use cases and solutions. We have built a platform and all the components to look at document content in a new way through brand new machine learning models, not only extracting the data of interest but also creating a virtual semantic map of the content that goes through the system. This digital fingerprint is enrichment-ready semantic data set up for the next phase, and it can also be leveraged today for specific acquisition point solutions. The first solution is Semantik Invoice, which is currently in beta for those interested in testing.

I want to emphasize that this is not just another single solution we can provide to the market; we have built a solid foundation to meet CDP acquire requirements as the phases develop. It is a foundation for future productivity solutions that deliver almost instant ROI and automation. We’ve developed simple, easy-to-use screens, with the goal of requiring little to no training while delivering high value.

We will also begin to integrate Ephesoft Transact into the phases of the Semantik Platform to provide extensive and enhanced capabilities in both the acquire and enrich phases. In the near future, we will leverage the Semantik stack and the benefits of this new technology. With our new CDP mindset, we are also starting to leverage our fuzzy DB, web services lookup and plugin architecture to provide enhanced enrichment of data during the content acquisition process. In addition, Transact can provide data labeling capabilities to enhance semantic models and provide clean, accurate data sets to AI initiatives. If you aren’t familiar with our data labeling initiatives and the capabilities of Transact in this area, please contact us. It is a fast-emerging market for our technology.

View a recording of our session on our BrightTALK Channel


Ephesoft Announces 2019 Partner Awards

The Ephesoft partner ecosystem is an important and essential part of delivering innovative solutions to our customers. We have different strategic partner types: technology, alliance (global consultancies and system integrators), channel, service delivery and OEM partners. For 2019, Ephesoft awarded three partners for their contributions based on involvement, customer success, moving the relationship forward, revenue and reach. At this year’s Sales Acceleration Summit in Irvine, California, the following partners were announced as recipients.

Global Partner of the Year in 2019: Infor

Infor was recognized as our Global Partner of the Year in 2019 as they served all of our regions worldwide and demonstrated excellence in customer success and satisfaction. Infor is an OEM partner that sells Infor Document Management (IDM) Capture powered by Ephesoft. They excelled in the Americas, EMEA and APAC territories in the past year. Based on their continued dedication to product integrations that impact our joint-customer success, their solution’s impact on the market and their ability to demonstrate real-world environments in the field, Infor earned our top accolade. Our joint partnership offers organizations a superior solution to operate seamlessly, automatically and intelligently – anywhere in the world.

Channel Partner of the Year in 2019: Shamrock Solutions 

Shamrock Solutions was named our Channel Partner of the Year in 2019. They specialize in systems integration with a focus on ECM, RPA and advanced content capture. They have a remote workforce spanning North America and offer support, services and consulting work to over 300 customers annually. As pioneers in the industry, they have become experts and found major success by having intelligent content capture discussions within every encounter. Shamrock offers multiple options in order to be a truly agnostic business automation supplier, meaning their clients’ needs always come first! Shamrock had a stellar 2019, doubling their revenue and making the top partner list for 3 of their vendors.

Global Alliance Partner of the Year for 2019: Grant Thornton

Having formed a partnership with Ephesoft early in the year, Grant Thornton received the Global Alliance Partner of the Year award for 2019. As a premier consulting and accounting firm in the United States, Grant Thornton recognized the need for unparalleled content capture in their clients’ digital transformation projects, offering Ephesoft to their client base across North America. Grant Thornton takes a client-first approach in all their engagements, truly assessing the business process and needs of the organization before considering the technical components of any solution. It’s this unique consulting approach that makes the Grant Thornton team successful in the field and in their sales and implementation efforts.

If you are interested in joining our partner program, please visit https://ephesoft.com/partners/become-a-partner/