Transact

⌘K
  1. Home
  2. Transact
  3. Features and Functions
  4. Administrator Role and Fe...
  5. Reporting
  6. Ephesoft Transact Reporting and Workflow

Ephesoft Transact Reporting and Workflow

Introduction

Reporting consists of the following major functions:

  • Data Migration
  • Data Calculation
  • Data Cleanup

Ephesoft Transact Reporting also supports multiple languages as part of the User Interface. For more information, refer to Global Languages.

Note: This document applies to Ephesoft Transact 2019.1 and above.

Reporting Process – Cron Jobs

A single master cron job controls the entire reporting process from data migration, to report calculations, to data cleanup. The following workflow steps are performed based on the cron job execution:

  1. Activity tables are optimized and unnecessary data is removed.
  2. Data used for the reports are migrated from the Application database to the Reports database and persisted to staging tables.
  3. Aggregation and calculations are performed from the staging tables and the results persist in calculation tables that are used when rendering the reports.
  4. The final data cleanup occurs on the Ephesoft Transact application tables and the staging tables used in the Reports database.

This entire sequence is executed by default every 15 minutes.  This is all controlled by a cron job in the reporting property file.

Refer to the following snapshot for an illustration of these cron jobs in the dcma-reporting.properties file, and the default settings:

Figure 1. Cron Job Expressions and Default Settings in the dcma-reporting.properties File

  • dcma.report.dashboard.serverStatus.cronExpression
# Cron expression for execution frequency of reporting jobs
    • By default, this is set to 15 minutes.
  • dcma.report.reporting.cronExpression
# Cron expression for archiving report data to the archive database
  • This operation should be configured to execute during off-hours (weekends, late night).
  • By default, this is set to 11:35 pm, Sunday night.
# Cron expression for archiving report data to an archive database.
    • This property defines the number of days of data to leave in the reporting database when archiving or purging occurs.
    • Note: Changing the setting to execute more frequently may impact system performance.
  • dcma.report.reportData.duration
# Cron expression that defines the number of days of data to leave in the Reporting database, at the time of executing the archival or purging job.
    • By default, this is set to 30 days. Any setting reflects the number of days reporting data is retained in the Reports database before the data is migrated to the Report Archive database.
    • Setting this to a higher value may impact system performance based on how much volume in terms of throughput is being processed. Customers processing over 3000 batches per day should use the default.
  • dcma.report.enable.reportingServer
# Cron expression for enabling or disabling this server to run ETL jobs
    • If true, this server is enabled to run reporting ETL jobs.
    • If false, ETL jobs are disabled for this server.
    • In a multi-server environment, one of the servers should have this setting as true.
    • In a single-server environment, this setting should always be true.
    • Customers may wish to disable ETL reporting for all servers. Set all servers to false. This option was not supported in Ephesoft Transact 4.5.x.x and previous releases.
  • dcma.report.enable.reporting
# Cron expression for cleanup and reports calculation, and specifies whether reporting is executed on this server, or not
    • This setting is useful for configuring reports in a multi-server environment. This setting can prevent report calculations from being performed on all servers. Instead, calculations can be performed on just one server.
    • If true, reporting data on this server will be collected and migrated to the Reports database.
    • If false, data cleanup is executed in the Applications database and no reports are calculated.
  • dcma.report.cleanupBatchInstance
# Cron expression enables or disables batch instance cleanup
    • If true, the reporting cleanup cron job also cleans the batch instance and associated tables for the finished and deleted batch instance entries.
    • If false, the user is responsible to clean up batch instance data and associated tables.

Note: If the user does not perform manual cleanup regularly, the system accumulates a large number of records in the batch instance table, and this will impact system performance.

Data Migration in Ephesoft Transact

Data migration occurs from the following Ephesoft tables to the Reports database:

  • batch_instance
  • act_hi_procinst
  • batch_class
  • batch_class_module
  • rp_throughput_module_data — This new table persists the number of documents and pages at the end of every workflow module in the batch workflow.
  • rp_data_corrections — This new table improves the correction reports as a part of Advanced Reporting. This new table stores JSON objects for more efficient mapping to the jobs that perform the calculations.

The default frequency of this migration cron job is every 15 minutes. Once Transact has migrated the tables successfully, Transact then runs the Advanced and Dashboard report jobs.

Note: The job for running Advanced Reporting only runs if the customer has purchased the Advanced Reporting license. Otherwise, this job does not run.

The batch.xml data that is required for reporting now persists to the Application database. The data migration job can leverage the necessary data strictly from the Application database. This makes data migration more efficient and simpler.

Data Calculation in Ephesoft Transact

Ephesoft Transact performs two separate calculation jobs:

  • Dashboard
  • Advanced Reports

Data Cleanup in Ephesoft Transact

Activiti has the following four history tables:

  • ACT_HI_PROCINST

Note: This table is used in reporting.

  • ACT_HI_ACTINST
  • ACT_HI_VARINST
  • ACT_GE_BYETARRAY

Ephesoft Transact 2019.1 performs data cleanup with the following factors in design:

  • The cleanup process is broken down into two separate jobs after the Advanced and Dashboard jobs are complete:
    • Cleanup tables in the Application database
    • Cleanup staging tables in the Reports database

These cleanup jobs run every 15 minutes. Once the batch instance is finished, the application deletes the history records.

  • Ephesoft Transact stores only relevant fields from the batch.xml file that are required for correction reports in these tables. These fields are stored as JSON objects.
  • The ApplicationcCleanup job is responsible for cleaning up Ephesoft tables where there were batch instance tables.
    • The ReportCleanup job cleans the staging tables that are used during migration.
  • The jobs are executed in sequential order:
    • Cleanup Procedure — The database contains a procedure that cleans the tables in the database.
    • Activiti Workflow Engine — When executing a batch instance, Ephesoft Transact uses an Activiti workflow engine. Activiti creates history tables. Three history tables are not needed in Ephesoft Transact, so the reporting method removes these.

Database Procedure for Cleanup

Reporting cleans the Activiti history table with dbo-CAL_ACT_HI-CLEANUP procedure.

The Activiti history table consists of the following three tables:

  • dbo.ACT_HI-ACTINST
  • dbo.ACT_GE_BYTEARRAY
  • dbo.ACT_HI_VARINST

The dbo-CAL_ACT_HI_CLEANUP procedure also updates the dbo.batch_instance. Reporting uses the dbo.batch_instance column to monitor status and cleans up the batch instance in the finished status.

  • These staging tables are persisted in the Reports database and used as part of data calculation that is used for the throughput (base) and corrections reports as part of advanced reporting.
    • dbo.rp_data_corrections — contains the batch.xml file data relative to corrections during each step of the Transact workflow
    • dbo.rp_throughput_module_data — tracks documents and pages in each batch instance for each module

Guidelines for Running Ephesoft Transact Reports

This topic provides guidelines and recommendations for running reports in Ephesoft Transact:

  • It is advisable to run the Ephesoft Reporting Service on a UI Server.
    • Running the Reporting Service on an executing server may cause delays in the Report ETL Job execution due to high CPU utilization by batches.
  • Dashboard and Standard report scripts and cron jobs have been merged into a single entity. This means both Dashboard and Throughput Reports will be updated at the same time in the UI. The Cron for this common ETL Script (Dashboard) should be frequent — every 15 to 20 minutes (the default setting is every 15 minutes).
  • The clean-up algorithm will delete data for FINISHED batches for which all reports have been successfully run (as per the license).
    • Along with this, XML files from SharedFolders/report-data and Activiti tables for the same batch instances will also be cleared.
  • Ephesoft recommends that all cron jobs should be configured so that minimal collisions occur.
    • For example, the cleanup cron could be set at the 45th/50th minute of every “n” hours to ensure that the chances of another cron being fired at the same instance are less.
  • ETL scripts are memory intensive due to the frequent sorting and aggregation of data. Hence, as per the load of batches, the JVM heap memory should be configured accordingly.

Low-Load Customers

The following specific recommendations apply for low-load customers (below 1000 pages/day):

  • The Dashboard Cron job can be set to a more frequent interval (every five minutes).
  • JVM Heap memory does not need to be adjusted or notably increased for the reporting to function properly.
  • Cron timings should be exclusive of each other by ensuring they do not get fired at the same time. This can be achieved by specifying the exact minute for the cron to be triggered. For example, 0 23 0/5 * * * denotes a cron that will be executed 23 minutes after every five hours.

Medium and High-Load Customers

The following specific recommendations apply for medium-load customers (1000-10,000 pages/day) and high-load customers (above 10,000 pages/day):

  • Multi-Server Setup is recommended with the Reporting Service running on a UI server.
  • The Dashboard Cron (which will update data for Dashboard and Throughput Reports) should be set to a slightly longer interval — every 15 minutes to allow scripts more time to complete.
  • JVM Heap memory should be increased in cases where extremely high numbers of batches are being processed regularly.
  • Cron timings should be exclusive of each other by making sure they do not get fired at the same time. This can be achieved by specifying the exact minute for the cron to be triggered. For example, 0 23 0/5 * * * denotes a cron that will be executed in 23 minutes after every five hours.

Configuring Reports

Ephesoft Transact allows administrators to adjust system-level configurations for Ephesoft reports, as follows:

Properties, .lgx and Script Files

Administrators may wish to adjust the Ephesoft Transact properties files, .lgx file, or ETL scripts for certain report functions and parameters. Possible configurations include the following:

  • etl-variables.properties
  • dcma-reporting.properties
  • application.properties
  • dcma-backup-service.properties
  • _Settings.lgx
  • ETL scripts

The following property file needs to be configured for connecting to the report database: {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/etl-variables.properties.

By default, the property is configured to point to the report database created by Ephesoft. If the administrator wishes to use a different database, this property file needs to be configured accordingly.

All properties are of type string.

Configurable Property Value Option Description
reporting.driverClassName For MySQL: com.mysql.jdbc.Driver

For MSSQL: jdbc:jtds:sqlserver://localhost; databaseName=report; user=<databaseUsername>; password=<databasePassword>

Driver class for database connection.

Example: for MySQL, it should be set to com.mysql.jdbc.Driver

reporting.dialect NA Dialects used by hibernate to use with database.

Example: org.hibernate.dialect.MySQL5InnoDBDialect

reportdata.path NA Path to the folder containing files for all executed batches.

Example: D:\\Ephesoft\\SharedFolders/report-data

Ephesoft.loginUserName NA This property is used to decrypt report data for encrypted Batches. Please enter SuperUser Name. If encryption of property file values is enabled, this value should also be encrypted.
Ephesoft.loginPassword NA This property is used to decrypt report data for encrypted Batches. Please enter SuperUser Password. If encryption of property file values is enabled, this value should also be encrypted.

Property file: {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/dcma-reporting.properties

Configurable Property Type of Value Value Option Description
dcma.report.dashboard.cronExpression Cronjob Expression NA Frequency with which Report Dashboard data and Throughput Report data is refreshed. Recommended value: “0 0/15 * ? * *” (Every 15 minutes)
dcma.report.dashboard.server
Status.cronExpression
Cronjob Expression NA Frequency with which Server Status data is refreshed. Recommended value: “0 0/1 * ? * *” (Every minute)
dcma.report.advanced.cronExpression Cronjob Expression NA Frequency with which Report Dashboard data is refreshed. Recommended value: “0 0 0/12 ? * *” (Every 12 hours)
dcma.report.cleanup.cronExpression Cronjob Expression NA This cron expression governs the time period with which Batches will be cleaned from Ephesoft. Any batch instance for which all three Reports have been generated, will be cleaned up when the cron triggers. (Report Data Folder, Database, Activiti Tables)
application.last.version DO NOT alter this property. This property is managed by Ephesoft to govern the upgrade status of Reports database. As an indicator, the presence of a dollar sign ($) before the application version confirms that upgrade process completed successfully

Property file: {Ephesoft-Home}/WEB-INF/classes/META-INF/application.properties

Configurable Property Type of Value Value Option Description
enable.reporting String True

False

Whether backup report-data files would be created.

Property file: {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-util/dcma-backup-service.properties

Configurable Property Type of Value Value Option Description
backup.report_folder String NA Path to the folder containing back up files for all executed plugins for batches.

Example: C:\\Ephesoft\\SharedFolders/report-data

Lgx file: {Ephesoft-Home}/EphesoftReports/_Definitions/_Settings.lgx

This file contains connection details to connect with the existing report database and the error location path. The report database can either be in MariaDB or SQLServer. Follow these steps to configure this file:

  1. Define the following parameters to establish the connection:
    • Connection Id
    • Type (SQL Server or MariaDB)
    • Port Number
    • Username
    • Password
    • Database name
    • Server name
  1. Specify the error location path as follows: {Ephesoft-Home}/EphesoftReports/logs/dcma-report.log

ETL Scripts: ETL Scripts (running in the background to process data) are stored in the following locations:

  • {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/Dashboard
  • {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/Standard
  • {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/Advanced

Scripts must be switched if the administrator needs to change the Reporting Database from MS SQL to MariaDB, or vice versa.

The ETL scripts for each respective database are at the following locations on the Transact server:

  • MS SQL — {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/MSSQL
  • MariaDB — {Ephesoft-Home}/WEB-INF/classes/META-INF/dcma-reporting/MariaDB

Encrypted Batch Class Configuration

The configurations are stored in the following file: {Ephesoft Installation Directory}\Application\WEB-INF\classes\META-INF\dcma-reporting\etl-variables.properties

The following two properties are added for Encrypted Batch support:

  • ephesoft.loginusername=<Username for which Ephesoft Web Services are authorized>
  • ephesoft.loginpassword=<Password for the above Username>

The user can support encryption for the above properties by using the following settings:

  • A combination of encrypted strings from Ephesoft Encryptor (<Ephesoft Installation Directory>\Application\native\encryption\Encryptor.bat)
  • Encryption switch in <Ephesoft Installation Directory>\Application\WEB-INF\classes\META-INF \dcma-encryption\dcma-encryption.properties.

Export Options for Reports

The following export options are available at the bottom of the widgets for all Ephesoft Reports:

  • Export PDF: User can export the report in PDF format.
  • Export Excel: User can export the report in excel format.
  • Print Report: User can open the report to a new tab/window and then print the report.

Failover Mechanism for Reports

Reports are supported for a multi-server Ephesoft environment as well. Reporting is accessible on all servers.

The user interface of Reports is present and hosted on all servers. Hence, it can be assumed that if Ephesoft is running on a machine, Ephesoft Reports will be accessible via localhost Ephesoft Application.

The Back-End (ETL Scripts) is governed by Ephesoft’s failover mechanism. ETL Scripts are stored at a shared location and only one server can run these scripts at a time. If that server shuts down for any reason, Ephesoft Heartbeat is triggered the failover mechanism and the ETL Scripts service are assigned to one of the remaining servers.

The server_registry table below displays the ETL Service running on Ephe32 system.

https://wikidev.ephesoft.com/wp-content/uploads/2015/11/word-image120.png

Figure 2. server_registry Table Example

The service_status table below displays the server_registry_id.

https://wikidev.ephesoft.com/wp-content/uploads/2015/11/word-image122.png

Figure 3. service_status Table Example

In case of server failure, when the failover mechanism takes over, ETL Service is assigned to EphesoftSVR as displayed in the server_registry table below.

https://wikidev.ephesoft.com/wp-content/uploads/2015/11/word-image124.png

Figure 4. ETL Service Assigned to EphesoftSVR in server_registry Table

The service_status table below displays updated server_registry_id.

https://wikidev.ephesoft.com/wp-content/uploads/2015/11/word-image126.png

Figure 5. service_status Table Displaying Updated server_registry_id

Last Sync Time

All Ephesoft Transact reports display the last sync time with the Ephesoft Transact server and services in the following format:

  • YYYY/MM/DD HH:MM:SS.

https://wikidev.ephesoft.com/wp-content/uploads/2016/10/word-image-59.png

Figure 6. User Throughput Report as Sample Format

Purging Mechanism

A purging mechanism archives data periodically in Ephesoft Transact for reporting. The frequency of this archival can be controlled by the user.

The user can decide the extent of archival. This means the user can control the amount of data to be archived based on how old it is.

Configurations

The user can control the purging mechanism using the following parameters:

  • Location: <Ephesoft_Home>\Application\WEB-INF\classes\META-INF\dcma-reporting
  • Filename: dcma-reporting.properties
  • Property name: dcma.report.purging.cronExpression (Default Value=Every Third month on the 1st Day at 12am)

This property can be configured to schedule the Purging Job using regular cron expressions.

  • Filename: etl-variables.properties
  • Property Name: dcma.report.reportData.duration (Default Value = 30 days)

As an example, if the value of this property is set to 90, this means that at every scheduled purge cycle, data older than 90 days will be archived. Recent Data (<90 days) will be retained.

  • If a user wants to archive the complete data every time, they can set the value of the property to 0.

The “reporting.purge_duration” property denotes the time period (in days) prior to which all data will be purged.

  • As an example, if the purge_duration = 5 days, batches with creation_date older than Current date will be purged.
    • Taking current date as 06-06-2016 5:00pm.
    • Batch Instance with creation date = 30-04-2016 1:00pm will be Purged
    • Batch Instance with creation date = 31-04-2016 1:00pm will be Purged
    • Batch Instance with creation date = 31-04-2016 8:00pm will NOT be Purged (number of days = 5, taking difference of 06-06-2016 5:00pm and 31-04-2016 8:00pm)
    • Batch Instance with creation date = 01-06-2016 1:00pm will NOT be Purged (number of days = 5, not greater than 5)
    • Batch Instance with creation date = 01-06-2016 5:30pm will NOT be Purged
    • Batch Instance with creation date = 03-06-2016 1:00pm will NOT be Purged
    • After purging, the Dashboard data is recalculated for remaining data.

Note: Batches that are in non-FINISHED state and are purged will be added back into original report DB. This is because the Status of these batches may change later on and reporting jobs need to process these changes. E.g. For a batch in the READY_FOR_REVIEW state, purging will copy the data into archive DB. But, in the next run of Dashboard job, this batch will be repopulated into Report DB. This is to make sure that once the state of batch changes to RUNNING, correct data is displayed on the reporting UI.

Steps to Use the Purging Mechanism

Perform these steps to use the Purging mechanism:

  1. Configure dcma.report.reportData.duration according to this requirement.
  2. Configure dcma.report.purging.cronExpression according to this requirement.

As a result of these configurations, the following processes are enabled:

  • After every purge cycle, data will be moved to archive database.
  • All remaining reports will be recalculated on the basis of the remaining data in the original Reports database.

Printing Ephesoft Reports

The Print feature enables the current report to be copied to new tab or window. The user can then use the print dialog on the new tab or window to print the current report.

A specified time delay is added before the report gets printed so that the charts are fully displayed on the new tab or window before being printed.

Reports Database Reporting Views

The following views are leveraged to perform on the fly calculations as the source of the corresponding reports:

 Aggregated Data View Name  Description
 batch_count_per_priority_1_v  Batch count per priority
 batch_count_per_status_1_v  Batch count per status
 batch_per_priority_v  Batch per priority
 batch_per_priority_status_v  Batch per priority status
 review_validate_backlog_1_v  Review Validate backlog view
 review_validate_backlog_dtls_v  Review Validate backlog details

The following are applicable for Ephesoft Transact 2020.1 or above:

 Aggregated Data View Name  Description
 batch_class_ppm_v  Batch class pages per minute
 module_duration_v  Module duration
 review_validate_module_dur_v  Review Validate module duration
 review_validate_plugin_dur_v  Review Validate plugin duration

Batch Name Column

The Batch Name column in the Batch Instance Report Data table contains names for the specific batch instance being run. The batch names in Batch Name and Batch Class Name may differ. An example is shown below.

Figure 7. Batch Instance Report Data — Batch Name Column