vonhachtaugust/ghost-ocr

Ghost-ocr

Extract text from a PDF by 'glimpsing' it.

The idea is to call this code as a subprocess, e.g. from Python, for machine learning purposes.
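A minimal sketch of that subprocess call from Python follows. The executable name and its arguments are hypothetical, since this README does not document the command line; the sketch also assumes the tool writes the extracted text to standard output.

```python
import subprocess
from typing import Sequence


def run_ocr(command: Sequence[str], timeout: float = 120.0) -> str:
    """Run an OCR command line and return whatever it wrote to stdout.

    `command` is the full argument list. The executable name and its
    arguments are assumptions, not the documented interface of this project.
    """
    result = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output bytes to str
        timeout=timeout,      # do not hang forever on a problematic PDF
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

A call might look like `text = run_ocr(["./ghost-ocr", "document.pdf"])`, where both the binary path and the argument layout are placeholders to adapt to the built executable.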

Getting Started

The project is built with CMake, version 3.9.3 or later. The bundled CMake scripts should give enough information about which dependencies are missing.
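If the build is driven from a Python script, the CMake version requirement can be checked before configuring. This is only a sketch and not part of the project: the helper names are made up for illustration, and it assumes `cmake --version` prints a line of the form `cmake version X.Y.Z`.

```python
import re
import subprocess
from typing import Tuple

MINIMUM_CMAKE = (3, 9, 3)  # minimum version stated in this README


def parse_cmake_version(version_output: str) -> Tuple[int, ...]:
    """Extract the numeric version from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("could not find a CMake version in the given text")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def cmake_is_new_enough(version_output: str) -> bool:
    """Compare the parsed version against the minimum, component by component."""
    return parse_cmake_version(version_output) >= MINIMUM_CMAKE


def installed_cmake_output() -> str:
    """Return the raw output of `cmake --version` (requires cmake on PATH)."""
    completed = subprocess.run(
        ["cmake", "--version"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

Calling `cmake_is_new_enough(installed_cmake_output())` then reports whether the installed CMake satisfies the requirement.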

Prerequisites

Ghostscript, libpng, and tesseract-ocr (don't forget the language data files, which are obtained separately).

Running the tests

Set the build test option to ON in the project CMake file. Building again should then download googletest (gtest/gmock), on which the tests are based.

Contributing

Feel free to contribute. For now, the only requirement for contributing is to use the same clang-format style as the project.

Authors

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
