KB00023068: ImageMagick impacts performance due to heavy file size

Component:

Performance, ImageMagick, TIFF to PNG conversion, CREATE_OCR_INPUT Plugin, CREATE_DISPLAY_IMAGE Plugin

 

Issue Description:

In some scenarios, customers observe that the PPM (pages per minute) of their batch class degrades and that Batch Instances spend most of their time in the CREATE_OCR_INPUT and CREATE_DISPLAY_IMAGE plugins. The TIFF to PNG conversion that ImageMagick carries out in these plugins can sometimes take a very long time.

This wiki article explains why ImageMagick sometimes takes so long in the plugins mentioned above.

 

Analysis:

  1. The CREATE_OCR_INPUT and CREATE_DISPLAY_IMAGE plugins are the ones involved in TIFF to PNG conversion: the first produces the PNG file that is provided as input to the OCR engine, and the second creates the display images shown at the Review and Validate stages.
  2. The image conversion is carried out by a third-party tool, ImageMagick, which generates the PNG files.
  3. The ImageMagick processing rate depends on the size of the image. This means not only the size of the file on disk but also the size of the image in memory. It is expected behavior for ImageMagick to take longer when the image is large in memory.
  4. As the screenshot below shows, the size of the images in memory is quite large, which is why ImageMagick takes time to process them.

      

 

  5. Therefore, the performance of the batch class for these two plugins will vary with the size of the files.
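
The in-memory size mentioned in point 3 can be approximated from the pixel dimensions of an image. The following is a minimal Python sketch, not ImageMagick's own accounting: the function name estimated_pixel_bytes is hypothetical, and the defaults of 4 channels at 16 bits per sample are an assumption about a typical ImageMagick build, so adjust them for your installation.

```python
def estimated_pixel_bytes(width_px, height_px, channels=4, bits_per_sample=16):
    """Roughly estimate the uncompressed size of an image in memory, in bytes.

    The defaults (4 channels, 16 bits per sample) are an assumption,
    not a value taken from this article or read from ImageMagick.
    """
    if min(width_px, height_px, channels, bits_per_sample) <= 0:
        raise ValueError("all arguments must be positive")
    # Each sample occupies a whole number of bytes, so round up.
    bytes_per_sample = (bits_per_sample + 7) // 8
    return width_px * height_px * channels * bytes_per_sample


def to_mebibytes(size_bytes):
    """Convert a byte count to MiB for easier reading."""
    return size_bytes / (1024 * 1024)
```

For example, an A4 page scanned at 300 DPI (2480 x 3508 pixels) comes to 69,598,720 bytes, about 66 MiB, under these assumptions, even if the compressed TIFF on disk is only a fraction of that.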

 

Suggestion to improve performance:

  1. If the Recostar Color Switch under the RECOSTAR_HOCR plugin is OFF, the CREATE_OCR_INPUT plugin does not need to generate PNG files; the TIFF files are used for OCR instead of PNG files. In that case the CREATE_OCR_INPUT plugin can be removed from the workflow. This saves the time taken by the CREATE_OCR_INPUT plugin, which is almost the same as the time taken by the CREATE_DISPLAY_IMAGE plugin.
  2. Please note that for this to work you will need to remove the dependency on the CREATE_OCR_INPUT plugin from the RECOSTAR_HOCR and RECOSTAR_EXTRACTION plugins on the SystemConfiguration -> Modules -> Plugin UI page.
  3. The CREATE_DISPLAY_IMAGE and CREATE_OCR_INPUT plugins themselves are working as expected: the time they take is the time ImageMagick needs for the conversion, which is long because the images are quite heavy, as mentioned earlier.
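
One way to check point 3 is to time the same conversion outside the workflow. The sketch below is only an illustration under stated assumptions: it assumes the ImageMagick convert executable is on the PATH and uses a plain "convert input output" call, which is not necessarily the exact command line the plugins issue. The helper names time_command and build_convert_command are hypothetical.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run a command and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(command, capture_output=True)
    return time.perf_counter() - start, completed.returncode


def build_convert_command(tiff_path, png_path, executable="convert"):
    """Build a plain TIFF to PNG conversion command.

    Assumption: the plugins may pass extra options that are not shown here.
    """
    return [executable, tiff_path, png_path]


# Only runs when called as: python script.py <input.tif> <output.png>
if __name__ == "__main__" and len(sys.argv) == 3:
    seconds, code = time_command(build_convert_command(sys.argv[1], sys.argv[2]))
    print(f"convert exited with code {code} after {seconds:.2f} s")
```

If the standalone conversion of a heavy page takes about as long as the plugin does for that page, the time is being spent in ImageMagick rather than in the plugin itself.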