Recognize with Timeout not stopping #290

Open
Ibariu opened this issue Dec 10, 2021 · 0 comments

Ibariu commented Dec 10, 2021

Hi everyone,

I have the following image, from which I am trying to extract the text.

[Image: image2test]

The expected output would be a few '\n' characters or blank spaces. The problem appears when I process the image with the recognize function and a timeout, so that the process stops if it takes too long: it does not stop, and it keeps running for 3 hours until the recognize function returns False.

To set the image on the API I am using setImageFile, which as far as I know can avoid some trouble when loading the image into the API (although I have also tried "setImage(Image.open('image2test.jpg'))"). Note also that this page is processed alongside the other pages extracted from the same PDF file. Of the 2 pages in this PDF, this is the only one that causes problems, making Tesseract OCR take 3 hours to extract its text. Since the image contains no relevant information, processing should be stopped and the page skipped. The page cannot be deleted before the OCR step: many PDFs may be processed, this isolated case may recur in the future, and I want to be able to catch it when it does.

The code I am using is the following:

[Image: screenshot of the code]

Here is more information on the API configuration and the Tesseract, Tesserocr and Pillow versions:

  • PSM: AUTO_OSD (1)
  • OEM: DEFAULT (3)
  • LANG: Spanish ('spa')
  • Tesseract: 4.1.1
  • Tesserocr: 2.5.1
  • Pillow: 8.4.0
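
Regarding the requirement that a stuck page be abandoned rather than processed: one possible workaround, offered as a sketch and not as a confirmed fix for this issue, is to run the OCR in a separate Python process and enforce the deadline from outside, so the limit does not depend on the recognize timeout being honoured. The helper name run_script_with_deadline, the OCR_SCRIPT string and the 60-second limit are assumptions made for illustration; the tesserocr calls inside the script only mirror the configuration listed above and are not the code from the screenshot.

```python
import subprocess
import sys

# Script executed in a child interpreter. Illustrative only: it mirrors the
# configuration from this issue (lang 'spa', PSM AUTO_OSD) and must be
# adapted to the real code.
OCR_SCRIPT = """
import sys
from tesserocr import PyTessBaseAPI, PSM

with PyTessBaseAPI(lang="spa", psm=PSM.AUTO_OSD) as api:
    api.SetImageFile(sys.argv[1])
    api.Recognize()
    sys.stdout.buffer.write(api.GetUTF8Text().encode("utf-8"))
"""


def run_script_with_deadline(script, args, timeout_s):
    """Run `script` in a child Python process and return its stdout.

    Returns None if the child runs longer than `timeout_s` seconds;
    subprocess.run kills the child before raising TimeoutExpired.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", script, *args],
            capture_output=True,
            encoding="utf-8",
            timeout=timeout_s,
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


# Usage (hypothetical 60-second limit):
#   text = run_script_with_deadline(OCR_SCRIPT, ["image2test.jpg"], 60)
#   if text is None:
#       ...  # page took too long: skip it and continue with the next one
```

The trade-off is the cost of starting a new interpreter and initialising Tesseract for every page, in exchange for a limit that is enforced by the operating system rather than by the library.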

Thanks for all the help 😉
