I am running OCR on a PDF image file. It contains numbered lines, and I have to extract the value for each line.
I was able to define the anchor area (the line number) and the OCR area, and it worked well with the first file.
I then selected a second file, ran the process on it, and the OCR failed.
The anchor area is the same (a line number), but it seems the OCR area is not recognized. As far as I understand, the “image” of the OCR area is not the same, hence the error.
So can you explain how the OCR area is defined? Is it just a shape whose coordinates are defined relative to the anchor area? In that case, the image found in that area is not relevant, and it should work for any file with the same pattern.
Or does it really look for the captured image? In that case, I understand that I have to capture a new image for each file for OCR to work.
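To make the first interpretation concrete, here is a minimal sketch of what I imagine "coordinates relative to the anchor" would mean. The names `Box` and `RelativeArea` and all pixel values are hypothetical and only illustrate my assumption; I do not know whether the tool actually works this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page pixels, origin at the top-left corner."""
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class RelativeArea:
    """Hypothetical OCR area stored as an offset from the anchor's top-left corner."""
    dx: int
    dy: int
    width: int
    height: int

    def resolve(self, anchor: Box) -> Box:
        # Only the anchor position matters; the pixels inside the area are never compared.
        return Box(anchor.left + self.dx, anchor.top + self.dy, self.width, self.height)


# The same area definition applied to the anchor "107" found in two different files
# (positions are made up for illustration).
area = RelativeArea(dx=60, dy=0, width=200, height=20)
box_file1 = area.resolve(Box(left=40, top=300, width=30, height=20))
box_file2 = area.resolve(Box(left=40, top=320, width=30, height=20))
```

If the area were defined like this, the second file should work as long as the anchor is found, whatever value is printed next to it.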
I don’t know if my question is clear, so here is an example.
In the first file I have these lines:
107 : 1.204.258,52
108 : 20.205,25
109 : 2.548,02
So I captured an image in which I defined the line number “107” as the anchor and the OCR area to capture the value on that line. And that works!
But if I run the script on a second file where the lines are:
107 : 4.258,52
108 : 220.105,20
109 : 5.142.548,02
I receive an OCR error.
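In case it helps, this is how I would expect each line to be interpreted once OCR has read its text. It is only a sketch: `LINE_RE` and `parse_line` are hypothetical names of mine and not part of any tool, and I am assuming that “.” is the thousands separator and “,” the decimal separator, as in my examples.

```python
import re
from decimal import Decimal

# "<line number> : <value>", with "." grouping thousands and "," before two decimals.
LINE_RE = re.compile(r"^\s*(\d+)\s*:\s*(\d{1,3}(?:\.\d{3})*,\d{2})\s*$")


def parse_line(text):
    """Return (line_number, value) for one recognized line, or raise ValueError."""
    match = LINE_RE.match(text)
    if match is None:
        raise ValueError(f"unexpected line format: {text!r}")
    number, raw_value = match.groups()
    # Drop the thousands separators, then switch the decimal comma to a point.
    value = Decimal(raw_value.replace(".", "").replace(",", "."))
    return int(number), value


first_file = parse_line("107 : 1.204.258,52")
second_file = parse_line("107 : 4.258,52")
```

Both files follow the same pattern even though the values have different lengths, which is why I expected the same definition to work for both.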
Can you please explain whether I understand correctly how OCR works, or should work, when processing multiple files that use the same pattern?
Thanks, and sorry for being so long.