{"id":2771,"date":"2015-01-22T01:04:10","date_gmt":"2015-01-22T08:04:10","guid":{"rendered":"https:\/\/ephesoft.com\/docs\/?p=2771"},"modified":"2020-08-26T15:59:56","modified_gmt":"2020-08-26T22:59:56","slug":"kb00007775-having-trouble-extracting-data-from-a-table-using-the-column-header-and-column-coordinates","status":"publish","type":"post","link":"https:\/\/ephesoft.com\/docs\/kb00007775-having-trouble-extracting-data-from-a-table-using-the-column-header-and-column-coordinates\/","title":{"rendered":"KB00007775: Issue with Table Extraction Using the Column Header and Column Coordinates"},"content":{"rendered":"
Trouble extracting data from a table using the Column Header<\/strong> and Column Coordinates<\/strong>.<\/span><\/p>\n Column Headers<\/strong> depend heavily on the OCR output matching the Column Header Pattern. Variations in the OCR can prevent the table or column from being extracted properly, or cause complete rows to be skipped. For example, if you configure the extraction rule to look for a column named \u201cPart Number\u201d but the OCR value is \u201cPert Number\u201d, the column or table will not extract properly.<\/span><\/p>\n Column Coordinates<\/strong> extract the contents that fall inside a zone defined by coordinates. If the zone coordinates are not defined properly, you may also get values that belong to the adjacent column.<\/span><\/p>\n To resolve issues with the recognition of Column Headers<\/strong>, you may need to account for variances in the OCR.<\/span><\/p>\n Here are some potential solutions:<\/span><\/p>\n For example, for a specific Column Header Pattern<\/strong> like \u201cPart Number\u201d, use a more generic Regex such as \u201cP[A-Za-z0-9\\s]{7}ber\u201d. This matches a \u201cP\u201d, followed by any seven letters, digits, or whitespace characters, followed by \u201cber\u201d, so both \u201cPart Number\u201d and \u201cPert Number\u201d match.<\/span><\/p>\n To resolve issues with the Column Coordinates<\/strong>, you may simply need to adjust your zones so that they fit the columns and allow for variations in the images. Variations include:<\/span><\/p>\n For the best results, standardize your input images and set a minimum quality requirement, for example a resolution of 2550\u00d73300 pixels at 300 DPI.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":" Issue Trouble extracting data from a table using the Column Header and Column Coordinates. 
Root Cause Column Headers are highly […]<\/p>\n","protected":false},"author":62,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12353],"tags":[722,1430,437,743],"yoast_head":"\nRoot Cause<\/strong><\/span><\/h2>\n
Solution<\/strong><\/span><\/h2>\n
\n
\n