
males / female symbols do not decipher properly #34

Open
myrmoteras opened this issue Jan 4, 2023 · 1 comment

myrmoteras commented Jan 4, 2023

In the Annals of the Natal Museum, the male and female symbols do not display properly in the new version of GGI. londt_1982b.pdf

[Image: the symbol rendered as the text "cJ"]

This is an issue with the OCR of the original PDFs (cJ). @gsautter, do you see a way to fix this on the GGI side in an automated way?


gsautter commented Jan 4, 2023

This is indeed due to errors in the OCR that comes with the PDF, accurate as it is otherwise. There is no real way of fixing this automatically (that would basically require a really good, well-trained OCR engine), but QC should be able to capture OCR conflicts now, and the cluster-based correction facilities let users fix this with far less effort than it would take to find and fix every individual instance of the symbol.
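As a rough illustration of the cluster idea only (this is not GGI's implementation, and every name below is hypothetical): identical OCR tokens are grouped so a reviewer can inspect one cluster such as "cJ" and correct all of its occurrences in a single step. Mapping "cJ" to the male sign is an assumption based on the example in this issue.

```python
import re
from collections import defaultdict

# Hypothetical reviewer decision: OCR string -> intended symbol.
# "cJ" is the misreading reported in this issue; reading it as the
# male sign (U+2642) is an assumption, not something the OCR tells us.
CORRECTIONS = {"cJ": "\u2642"}


def cluster_tokens(lines):
    """Group identical word tokens and record (line index, column) of each."""
    clusters = defaultdict(list)
    for line_no, line in enumerate(lines):
        for match in re.finditer(r"\w+", line):
            clusters[match.group()].append((line_no, match.start()))
    return clusters


def apply_corrections(lines, corrections):
    """Replace each corrected token wherever it stands alone as a word."""
    fixed = []
    for line in lines:
        for wrong, right in corrections.items():
            # Lookarounds keep "cJ" inside longer words (e.g. "McJones") untouched.
            pattern = rf"(?<!\w){re.escape(wrong)}(?!\w)"
            line = re.sub(pattern, lambda _m, r=right: r, line)
        fixed.append(line)
    return fixed


if __name__ == "__main__":
    text = ["1 cJ, 2 cJ", "McJones"]
    print(cluster_tokens(text)["cJ"])          # positions a reviewer would inspect
    print(apply_corrections(text, CORRECTIONS))
```

The correction still depends on a human deciding what each cluster means; the sketch only shows why one decision per cluster is cheaper than one per occurrence.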
