School/College Stationery List OCR and Parsing
Updated Apr 3, 2017 - C++
A Python command-line utility for automating some copyediting tasks in documents. It edits zipped, XML-based files (e.g. docx, odt, or epub) through XSLT stylesheets, and can be extended fairly easily with your own custom XSL stylesheets.
Convert scans of handwritten notes to PDF.
A document preprocessor that works in conjunction with tools like groff/troff & refer.
An implementation of basic IR techniques from scratch.
A program that helps remove watermarks from a PDF document.
Apply keyword procedures in a given Racket namespace using X-expressions.
tokyo, a REST API that, given any type of document 📄, identifies its MIME type 🧐, suggests an extension 🦔, and also extracts the text 💪.
Semantic extraction from conference proceedings.
This library builds a graph representation of the content of PDFs. The graph is then clustered, and the resulting page segments are classified and returned. Tables are returned formatted as CSV.
An include filter for Pandoc
Generic framework for historical document processing
This set of robots automatically extracts information from invoices using the docDigitizer API and keeps track of the processed invoices in an Airtable repository.
A comprehensive list of annotated training datasets classified by use case.
FileGazer - deep file analysis and categorisation
Spire.Doc for C++ is a professional Word C++ library designed for developers to create, read, write, convert, merge, split, and compare Word documents on any C++ platform with fast, high-quality performance.
Unofficial mirror of git://git.lyx.org/lyx.git (updated daily; not affiliated with lyx.org).
Minimize the time required for audit report analysis with a containerized file conversion and scraping system.
Merge two relatively incomplete documents into one complete document via a Python script.