Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ocrd cli #33

Merged
merged 20 commits into from
Apr 22, 2021
Merged

Ocrd cli #33

merged 20 commits into from
Apr 22, 2021

Conversation

kba
Copy link
Contributor

@kba kba commented Apr 13, 2021

No description provided.

@kba kba marked this pull request as ready for review April 14, 2021 08:38
@kba kba mentioned this pull request Apr 14, 2021
Copy link
Contributor

@bertsky bertsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Far from a deep understanding, but some things should be improved even in the first version I believe.

@vahidrezanezhad, please have a look whether my suggestions for the ocrd-tool parameter descriptions are correct.

@kba, please consider the common case where meta-data is present but has no DPI, and if it's not too late (or too much), try to improve the general (image/PAGE) data passing interface.

@kba
Copy link
Contributor Author

kba commented Apr 22, 2021

eynollah itself doesn't work with the page.imageFilename, so I don't see a reason to changing it.

yes it does: you pass it as image_filename kwarg below, which then gets opened with cv2.imread.

oh right, my bad, will change it to pass it the return of download_file.

Since we're encouraging METS-relative paths, there might be mets:files that are URL but I can't think of a scenario where pc:Page/@imageFilename would be a URL (except in our test data).

granted, but then why use mets.find_files(url=page.imageFilename) here at all?

Mostly because I want to make sure that the PAGGE @imageFilename is in the METS. If it isn't this will raise an IterationError.

Copy link
Contributor

@bertsky bertsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It all makes sense to me now (but have not tested).

@vahidrezanezhad vahidrezanezhad merged commit d5be8ae into main Apr 22, 2021
@kba kba deleted the ocrd-cli branch April 22, 2021 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants