Ephesoft Insight is a document mining analytics tool that enables organizations to unlock the business intelligence trapped in unstructured content. Tapping into the data stored in document management systems, file shares and line-of-business applications allows organizations to reduce risks, drive profits, improve operations and support strategic decision-making.

Insight is powered by Ephesoft’s patented machine learning technology. Put simply, the application learns how to classify and organize documents, so manual processing is reduced, often by up to 90%. Once the document type is recognized and metadata is extracted, Insight’s analytics engine correlates the data, or as we commonly say, “connects the dots.” Organizations use this feature to mitigate risk, identify fraudulent activity, recognize business opportunities and make better operational decisions, all based on real data.

Insight’s automatic document indexing capability is powered by “multidimensional” analysis, which categorizes content and extracts data from records. Four key dimensions come into play when analyzing the text embedded in a document: anchor blocks, data values, knowledge bases and natural language processing (NLP). Each dimension is evaluated independently and assigns its own confidence score for classification and data extraction. Those confidence scores are then aggregated using a weighted-mean calculation, which helps increase the accuracy of the data that is collected.
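
The weighted-mean aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ephesoft's implementation: the function name `aggregate_confidence`, the per-dimension scores and the weights are all hypothetical, and only the four dimension names come from the text.

```python
def aggregate_confidence(scores, weights):
    """Return the weighted mean of per-dimension confidence scores.

    scores  -- dict mapping a dimension name to a confidence in [0, 1]
    weights -- dict mapping the same dimension names to non-negative weights
    """
    total_weight = sum(weights[dim] for dim in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(scores[dim] * weights[dim] for dim in scores)
    return weighted_sum / total_weight


# Hypothetical scores for one candidate field; the values are made up.
scores = {
    "anchor_blocks": 0.9,
    "values": 0.8,
    "knowledge_bases": 0.6,
    "nlp": 0.7,
}

# Hypothetical weights: here the anchor-block dimension counts double.
weights = {
    "anchor_blocks": 2.0,
    "values": 1.0,
    "knowledge_bases": 1.0,
    "nlp": 1.0,
}

combined = aggregate_confidence(scores, weights)
print(round(combined, 2))
```

A dimension with a larger weight pulls the combined score toward its own confidence, so a strong signal from one dimension can offset a weaker one elsewhere.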

A breakdown of the mechanics of multidimensional analysis:

  • Anchor blocks: These capture spatial relationships on the page. Candidate values can be identified and associated with predefined values because both have relationships with their respective anchor blocks.
  • Values: Ephesoft Insight detects and learns textual patterns, which can be considered a dictionary of predefined data types. One example of a recognized textual value is an address – the application automatically recognizes the composite of a street number, city, state and zip code as a US address.
  • Knowledge bases: These are industry-specific data dictionaries that draw meaning based on an organization’s requirements. They allow an organization to customize the analysis and focus on the data that applies directly to its interests.
  • Natural Language Processing (NLP): Insight uses NLP to identify blocks of text based on the content of the block or paragraph itself. When working with highly unstructured documents such as contracts, this dimension allows Insight to capture data from the relevant sections of the document.
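
To make the "values" dimension concrete, the sketch below shows what recognizing a composite US address could look like. It is a deliberately simplified stand-in: Insight is described as learning textual patterns, whereas this example uses a fixed, hand-written regular expression. The pattern, the `find_address` helper and the sample sentence are all hypothetical.

```python
import re

# Hypothetical, simplified pattern: street number and name, city,
# two-letter state code, and a 5-digit ZIP with an optional +4 suffix.
US_ADDRESS = re.compile(
    r"(?P<street_number>\d+)\s+(?P<street>[^,]+),\s*"
    r"(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+"
    r"(?P<zip>\d{5}(?:-\d{4})?)"
)


def find_address(text):
    """Return the first address-like match as a dict of parts, or None."""
    match = US_ADDRESS.search(text)
    return match.groupdict() if match else None


# Made-up sample text for illustration.
sample = "Ship to 123 Main St, Irvine, CA 92618 by Friday."
address = find_address(sample)
print(address)
```

A real system would need to handle far more variation (unit numbers, missing commas, spelled-out states), which is one reason learned patterns are attractive over fixed rules.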

Through multidimensional analysis and integrated analytics, Insight allows companies and government agencies to transform unstructured content buried in document repositories into structured data sets, without the need for rigid document-specific templates or hundreds of samples to create the learning model. Put simply, the tool transforms documents into actionable data that helps organizations make better decisions and “connect the dots.”