This repository contains table-related inference calls for reconstructing tables in documents. All the code lives in the 'tables' directory, and the 'uploads' directory holds sample images.
pip install -r requirements.in
Download sprint.pt from the Releases section and place it in the 'tables/model' directory.
The following table calls are integrated in this repository:
Table detection (td) is based on our trained YOLO model, which handles multilingual table detection.
python3 infer.py <page-image-path> td True
Table structure recognition (tsr) is based on SPRINT, our script-agnostic table structure recognizer, which predicts OTSL sequences.
python3 infer.py <table-image-path> tsr True
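To make the OTSL output of the tsr call more concrete, here is a minimal sketch of how an OTSL token sequence can be turned into cells with row and column spans. It assumes the token vocabulary from the published OTSL format ("C" for a new cell, "L" for a cell merged with its left neighbour, "U" for a cell merged with its upper neighbour, "X" for a cell merged in both directions, "NL" for end of row). The exact strings that infer.py emits may differ, and the function names `otsl_to_rows` and `otsl_to_cells` are hypothetical helpers, not part of this repository.

```python
from typing import Dict, List


def otsl_to_rows(tokens: List[str]) -> List[List[str]]:
    """Split a flat OTSL token sequence into rows at each "NL" token."""
    rows: List[List[str]] = []
    current: List[str] = []
    for tok in tokens:
        if tok == "NL":
            rows.append(current)
            current = []
        else:
            current.append(tok)
    if current:  # tolerate a sequence without a trailing "NL"
        rows.append(current)
    return rows


def otsl_to_cells(tokens: List[str]) -> List[Dict[str, int]]:
    """Return one dict per "C" token with its position and spans.

    Assumed token meanings (from the OTSL format, not from this repo):
    "L" extends the cell on its left, "U" extends the cell above it.
    """
    rows = otsl_to_rows(tokens)
    cells: List[Dict[str, int]] = []
    for r, row in enumerate(rows):
        for c, tok in enumerate(row):
            if tok != "C":
                continue
            # Count "L" tokens to the right to get the column span.
            colspan = 1
            while c + colspan < len(row) and row[c + colspan] == "L":
                colspan += 1
            # Count "U" tokens below to get the row span.
            rowspan = 1
            while (
                r + rowspan < len(rows)
                and c < len(rows[r + rowspan])
                and rows[r + rowspan][c] == "U"
            ):
                rowspan += 1
            cells.append(
                {"row": r, "col": c, "rowspan": rowspan, "colspan": colspan}
            )
    return cells


# A 2x3 grid: the first cell spans two columns, the last cell spans two rows.
example = ["C", "L", "C", "NL", "C", "C", "U", "NL"]
for cell in otsl_to_cells(example):
    print(cell)
```

Each returned dict maps directly onto the rowspan and colspan attributes of an HTML table cell, which is one common way to render a recognized structure.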
The ocr call uses the YOLO-based table detector, SPRINT, and Tesseract to generate an hOCR output composed of the text and tables in the input page image.
python3 infer.py <page-image-path> ocr True
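Once the ocr call has produced hOCR, the words and their bounding boxes can be read back with the Python standard library. The sketch below assumes the output follows the usual hOCR conventions that Tesseract emits (word elements are `span` tags with class `ocrx_word`, and the `title` attribute carries `bbox x0 y0 x1 y1`). Whether this repository's hOCR uses exactly these class names is an assumption, and `HocrWordParser` and `extract_words` are hypothetical names.

```python
import re
from html.parser import HTMLParser
from typing import List, Optional, Tuple

BBox = Tuple[int, int, int, int]
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from spans with class 'ocrx_word'."""

    def __init__(self) -> None:
        super().__init__()
        self.words: List[Tuple[str, Optional[BBox]]] = []
        self._in_word = False
        self._bbox: Optional[BBox] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attr.get("title") or "")
            self._bbox = tuple(int(v) for v in match.groups()) if match else None
            self._text = []
            self._in_word = True

    def handle_data(self, data):
        if self._in_word:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes word spans contain no nested <span> elements.
        if self._in_word and tag == "span":
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._in_word = False


def extract_words(hocr_text: str) -> List[Tuple[str, Optional[BBox]]]:
    """Parse an hOCR string and return its words with bounding boxes."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    return parser.words


# Hand-written hOCR fragment for illustration, not real output of infer.py.
sample = (
    "<div class='ocr_page' title='bbox 0 0 200 100'>"
    "<span class='ocr_line' title='bbox 10 10 190 30'>"
    "<span class='ocrx_word' title='bbox 10 10 60 30; x_wconf 96'>Table</span> "
    "<span class='ocrx_word' title='bbox 70 10 120 30; x_wconf 91'>one</span>"
    "</span></div>"
)
for text, bbox in extract_words(sample):
    print(text, bbox)
```

The same approach extends to other hOCR elements (for example lines or table regions) by matching a different class name.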
cd tables
docker build -t tablecalls .
docker run --rm --gpus all -it -v '/data/DHRUV/Document-OCR-App/document-layout-ocr/uploads/table.jpg':/docker/uploads/table.jpg tablecalls uploads/table.jpg tsr False
Uses Streamlit to run all the required calls.
streamlit run api.py