image_text_reader

A very basic tool for reading text from images, specifically images laid out like a restaurant menu.

Tesseract-ocr

This tool needs the tesseract-ocr engine. The Tesseract wiki covers installation:

https://github.com/tesseract-ocr/tesseract/wiki

Linux

Tesseract is available directly from many Linux distributions. The package is generally called 'tesseract' or 'tesseract-ocr'; search your distribution's repositories to find it. For example, you can install Tesseract 4.x and its developer tools on Ubuntu 18.x (bionic) by running:

sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

Refer to the Tesseract wiki linked above for installation on all other systems.

macOS

Homebrew

To install Tesseract run this command:

brew install tesseract
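
After installing on either platform, it can help to confirm that the `tesseract` binary is actually on the PATH before running this tool. The following is a minimal sketch using only the Python standard library; the helper name `tesseract_version` is made up for illustration and is not part of this repository.

```python
import shutil
import subprocess
from typing import Optional


def tesseract_version() -> Optional[str]:
    """Return the first line of `tesseract --version`, or None if not installed."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture both streams; some releases use stderr
        text=True,
        check=False,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version or "tesseract not found on PATH")
```

If the function returns `None`, the install step above did not put the binary on the PATH of the current shell.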

OCR is run on the text regions extracted from the full image, not on the full image itself.
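
To illustrate the OCR step on a single cropped region, here is a hedged sketch that calls the `tesseract` command-line program directly. It is not the code in this repository's `tesseract_interface` package; the function names, the `eng` language and the page segmentation mode 6 (a single uniform block of text) are assumptions chosen for the example.

```python
import subprocess
from typing import List


def build_ocr_command(image_path: str, lang: str = "eng", psm: int = 6) -> List[str]:
    """Build a tesseract CLI call that prints recognized text to stdout."""
    # "stdout" as the output base tells tesseract to print instead of writing a file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_image(image_path: str, lang: str = "eng", psm: int = 6) -> str:
    """Run tesseract on one image (for example a cropped region) and return its text."""
    result = subprocess.run(
        build_ocr_command(image_path, lang, psm),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout.strip()
```

Calling `ocr_image` once per extracted region and joining the results would approximate the behaviour described above, under these assumptions.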

Command to use:

Dockerized image reading

docker run -it yardstick17/image-text-reader bash -c "PYTHONPATH='.' python3 read_image.py read_text_from_local_image -f images/sample_image.jpg"

Read from URL

PYTHONPATH='.' python3 read_image.py read_text_from_image_url -u https://marketplace.canva.com/MACHUlPU93Q/1/0/thumbnail_large/canva-peach-green-leaves-garden-vegetarian-pizza-menu-MACHUlPU93Q.jpg

[2017-07-07 16:20:34,119] INFO : Downloading image from url: https://marketplace.canva.com/MACHUlPU93Q/1/0/thumbnail_large/canva-peach-green-leaves-garden-vegeta
[2017-07-07 16:20:35,997] INFO : Saving file: /var/folders/cz/n3vkz7x91qs06nmm9byxxgz00000gr/T/tmpienrxu2c
[2017-07-07 16:20:35,997] INFO : Processing image for text Extraction
[2017-07-07 16:20:36,308] INFO : Removing noise and smoothening image
[2017-07-07 16:20:36,431] INFO : Reading the text inside the contour plotted
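
The first two log lines suggest that the image is downloaded and saved to a temporary file before processing. A minimal sketch of that step, using only the standard library, might look like the following; `download_to_temp_file` is a hypothetical name and this is not necessarily how `read_image.py` does it.

```python
import shutil
import tempfile
import urllib.request


def download_to_temp_file(url: str, timeout: float = 30.0) -> str:
    """Download `url` into a new temporary file and return the file's path."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # delete=False keeps the file on disk after the handle is closed,
        # so a later processing step can open it by path.
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            shutil.copyfileobj(response, tmp)
            return tmp.name
```

The caller is responsible for deleting the returned file once the text has been read.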

Read from local image

PYTHONPATH='.' python3 read_image.py read_text_from_local_image -f images/sample_image.jpg

[2017-07-07 16:32:38,862] INFO : Processing image for text Extraction
[2017-07-07 16:32:39,232] INFO : Removing noise and smoothening image
[2017-07-07 16:32:39,442] INFO : Reading the text inside the contour plotted
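
The two commands above differ only in the subcommand and flag. If you want to drive them from another Python script, a small wrapper along these lines may be convenient. The subcommands and flags are taken from the commands above; the function names `build_command` and `run_reader` are illustrative only.

```python
import os
import subprocess
from typing import List


def build_command(source: str) -> List[str]:
    """Pick the read_image.py subcommand based on whether `source` is a URL."""
    if source.startswith(("http://", "https://")):
        return ["python3", "read_image.py", "read_text_from_image_url", "-u", source]
    return ["python3", "read_image.py", "read_text_from_local_image", "-f", source]


def run_reader(source: str, repo_dir: str = ".") -> subprocess.CompletedProcess:
    """Run read_image.py from `repo_dir` with PYTHONPATH set to '.'."""
    env = dict(os.environ, PYTHONPATH=".")
    return subprocess.run(build_command(source), cwd=repo_dir, env=env, check=True)
```

For example, `run_reader("images/sample_image.jpg")` run from the repository root corresponds to the local-image command shown above.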

Deploy an API for reading text from an image

PYTHONPATH='.' python3 api/app.py

[2017-07-07 16:49:57,818] INFO :  * Running on http://0.0.0.0:6600/ (Press CTRL+C to quit)
[2017-07-07 16:49:57,820] INFO :  * Restarting with stat
[2017-07-07 16:49:58,712] WARNING :  * Debugger is active!
[2017-07-07 16:49:58,738] INFO :  * Debugger pin code: 316-405-633

A sample API is deployed on my tiny server. Please be patient with it.

curl -X POST \
  http://54.254.214.96/read_image_from_file/url \
  -F url=https://africatalentbank.com/wp-content/uploads/2014/10/Menu.jpg
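
The same request can be sent from Python. The sketch below mirrors the `curl -F` call by encoding the `url` field as multipart/form-data with the standard library. The route and field name come from the curl command above; the helper names are hypothetical, and because the response format is not documented here, the body is returned as raw text.

```python
import urllib.request
import uuid
from typing import Dict, Tuple


def encode_multipart(fields: Dict[str, str]) -> Tuple[bytes, str]:
    """Encode plain text fields as multipart/form-data, as `curl -F` does."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def read_image_from_url(api_base: str, image_url: str, timeout: float = 60.0) -> str:
    """POST `image_url` to the read_image_from_file/url route; return the raw body."""
    body, content_type = encode_multipart({"url": image_url})
    request = urllib.request.Request(
        f"{api_base.rstrip('/')}/read_image_from_file/url",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Against a locally started server, `api_base` would presumably be the address printed at startup (port 6600 in the log above), assuming the local app exposes the same route.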
