Ghost-ocr

Extract text by 'glimpsing' a PDF.

The idea is to call this code as a subprocess, e.g. from Python, for machine learning purposes.
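A minimal sketch of the subprocess approach in Python. The executable name `ghost-ocr`, the convention of passing the PDF path as the last argument, and the assumption that the extracted text is printed to stdout are all hypothetical here, since this README does not document the command-line interface; adjust them to match the binary you actually build.

```python
import subprocess
from pathlib import Path


def extract_text(pdf_path, executable="ghost-ocr", extra_args=(), timeout=120):
    """Run an OCR executable on one PDF and return what it prints to stdout.

    ASSUMPTIONS (not confirmed by the project): the executable name, the
    "PDF path as last argument" convention, and text output on stdout.
    """
    command = [str(executable), *extra_args, str(Path(pdf_path))]
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode the captured bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,      # do not let one file stall a whole pipeline
    )
    return result.stdout
```

Calling `extract_text("paper.pdf", executable="./build/ghost-ocr")` would then return the recognized text as a string, where the path to the built binary is again only a placeholder. With `check=True`, a failed run raises an exception instead of silently returning an empty string, which tends to be easier to debug in a data-preparation pipeline.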

Getting Started

The project is built with CMake, version 3.9.3 or later. If a dependency is missing, the project's CMake scripts should give enough information about what needs to be installed.

Prerequisites

Ghostscript, libpng, and tesseract-ocr. Don't forget the tesseract language data files, which are distributed separately.
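Before configuring the build, a quick check like the one below can show whether the command-line tools are on `PATH`. This is only a rough sketch: the tool names `gs` and `tesseract` are the usual Unix executable names (an assumption for your platform), and finding them does not guarantee that the development headers and libraries CMake needs, including libpng, are installed.

```python
import shutil


def missing_tools(tools=("gs", "tesseract"), which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    ASSUMPTION: "gs" and "tesseract" are the executable names on this
    system; on Windows, Ghostscript is typically named differently.
    """
    return [name for name in tools if which(name) is None]
```

An empty list from `missing_tools()` means both executables were found. The `which` parameter exists only so the lookup can be replaced in tests.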

Running the tests

Set the build test option to ON in the project CMake file. The next build should then download googletest (gtest/gmock), which the tests are based on.

Contributing

Feel free to contribute. For now, the only requirement for contributing is to use the project's clang-format configuration.

Authors

August von Hacht - Initial work - vonhachtaugust

See also the list of contributors who participated in this project.

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
