Performs a very fast OCR on a list of images (file path, url, base64, bytes, numpy, PIL ...) using Tesseract and returns the recognized text, its coordinates, and line-based word grouping in a DataFrame.
Updated Nov 14, 2023 - Python
Scripts to convert low-quality scanned PDFs to text files using Google Cloud Vision and GPT-3 for spellchecking
A small, simple CLI script that automatically runs OCR on an image, or splits a PDF into page images and then runs OCR on each page.
Windows application for text decoding using the TesseractOCR library.
Text extraction from image through OCR
The app extracts tabular data from PNG, JPG, or PDF files uploaded by the user and converts it into a downloadable CSV file.
This repository contains a document scanner app that can perform Optical Character Recognition.
Extracting tabular data from scanned PDFs with OpenCV and PyTesseract.
A Baybayin OCR software package. These algorithms aim to recognize Baybayin texts at the character, word, and block levels.
Multiprocessing OCR with Tesseract
BizCardX is a Streamlit-based tool that uses OCR to extract and manage business card data. Easily upload cards, extract information, and store it in a PostgreSQL database.
A Python-based project that extracts text from images using Optical Character Recognition (OCR) techniques, leveraging the Tesseract OCR engine.
A simple desktop application that extracts text from images using OpenCV and the Pytesseract-OCR module of Python 3. The GUI is implemented using the Tkinter module of Python 3.
Entity recognition on images using OCR (open source)