SPRINT: Script-agnostic Structure Recognition in Tables


Accepted at the ICDAR 2024 conference

  1. This repository contains the code and details of the SPRINT model.
  2. The trained models are available for inference and testing in the releases section - link
  3. Details regarding the results are described throughout this README.

SPRINT Teaser Figure

Abstract

Table Structure Recognition (TSR) is vital for various downstream tasks like information retrieval, table reconstruction, and document understanding. While most state-of-the-art (SOTA) research predominantly focuses on TSR in English documents, the need for similar capabilities in other languages is evident, considering the global diversity of data. Moreover, creating substantial labeled data in non-English languages and training these SOTA models from scratch is costly and time-consuming. We propose TSR as a language-agnostic cell arrangement prediction and introduce SPRINT, Script-agnostic Structure Recognition in Tables. SPRINT uses the recently introduced Optimized Table Structure Language (OTSL) sequences to predict table structures. We show that when coupled with a pretrained table grid estimator, SPRINT can improve the overall tree edit distance-based similarity structure (TEDS-S) scores of tables even for non-English documents. We experimentally evaluate our performance across benchmark TSR datasets including PubTabNet, FinTabNet, and PubTables-1M. Our findings reveal that SPRINT not only matches SOTA models in performance on standard datasets but also demonstrates lower latency. Additionally, SPRINT excels in accurately identifying table structures in non-English documents, surpassing current leading models by an absolute average increase of 11.12%. To encourage further research, we release our code and the Multilingual Scanned and Scene Table Structure Recognition Dataset (MUSTARD), labeled with OTSL sequences for 1428 tables in thirteen languages encompassing several scripts. Additionally, we present an algorithm for converting valid OTSL sequences into a widely used HTML-based table representation.

Getting Started

Installation

1. virtualenv venv
2. source venv/bin/activate
3. pip install -r requirements.txt

Inference Implementation

The trained models can be downloaded here - link

  1. Inference can be run from the src/infer.py file. The parameters required for model loading, the input data directory, and other settings are defined in src/config.py.

  2. Run python infer.py to execute the code and obtain predictions in OTSL format.

  3. To convert results from OTSL to HTML, run python src/otsl_to_html.py (a sketch of the conversion idea follows this list).

  4. TEDS evaluation has been set up in the utils/teds_evaluation directory, with instructions included (see the scoring sketch after this list).
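
For reference, here is a minimal sketch of how a valid OTSL sequence can be mapped to HTML, following the published OTSL token semantics: "C" starts a new cell, "L" merges with the cell to its left, "U" merges with the cell above, "X" merges both ways, and "NL" ends a row. This illustrates the idea only and is not the exact logic of src/otsl_to_html.py:

```python
# Minimal OTSL -> HTML sketch (illustrative; assumes a valid OTSL sequence).
def otsl_to_html(tokens):
    # Split the flat token sequence into a 2-D grid of rows at "NL".
    grid, row = [], []
    for tok in tokens:
        if tok == "NL":
            grid.append(row)
            row = []
        else:
            row.append(tok)
    if row:
        grid.append(row)

    # owner[r][c] records which "C" cell covers grid position (r, c).
    owner = [[None] * len(line) for line in grid]
    spans = {}  # (r, c) of a "C" cell -> [rowspan, colspan]
    for r, line in enumerate(grid):
        for c, tok in enumerate(line):
            if tok == "C":
                owner[r][c] = (r, c)
                spans[(r, c)] = [1, 1]
            elif tok == "L":          # extend the cell on the left
                owner[r][c] = owner[r][c - 1]
            elif tok in ("U", "X"):   # extend the cell above
                owner[r][c] = owner[r - 1][c]

    # Grow each "C" cell's spans to cover every position it owns.
    for r, line in enumerate(grid):
        for c, _ in enumerate(line):
            orow, ocol = owner[r][c]
            spans[(orow, ocol)][0] = max(spans[(orow, ocol)][0], r - orow + 1)
            spans[(orow, ocol)][1] = max(spans[(orow, ocol)][1], c - ocol + 1)

    # Emit one <td> per "C" cell, with row/colspan attributes where needed.
    html = ["<table>"]
    for r, line in enumerate(grid):
        html.append("<tr>")
        for c, tok in enumerate(line):
            if tok != "C":
                continue
            rs, cs = spans[(r, c)]
            attrs = (f' rowspan="{rs}"' if rs > 1 else "") + \
                    (f' colspan="{cs}"' if cs > 1 else "")
            html.append(f"<td{attrs}></td>")
        html.append("</tr>")
    html.append("</table>")
    return "".join(html)

print(otsl_to_html(["C", "L", "NL", "C", "C", "NL"]))
# -> <table><tr><td colspan="2"></td></tr><tr><td></td><td></td></tr></table>
```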
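For scoring, TEDS-S compares predicted and ground-truth table structures as trees. Below is a minimal sketch of the metric using the apted library (pip install apted); the actual scripts in utils/teds_evaluation may differ, and full TEDS reference implementations additionally parse HTML and can compare cell content:

```python
# A minimal sketch of TEDS-style scoring with the `apted` library
# (pip install apted); illustrative only, not the utils/teds_evaluation code.
from apted import APTED
from apted.helpers import Tree

def teds(tree_a: str, tree_b: str) -> float:
    """TEDS = 1 - edit_distance / max(node count), on bracket-notation trees."""
    dist = APTED(Tree.from_text(tree_a),
                 Tree.from_text(tree_b)).compute_edit_distance()
    # In bracket notation, each "{" opens exactly one node.
    n = max(tree_a.count("{"), tree_b.count("{"))
    return 1.0 - dist / n

# A one-row, two-cell table vs. a one-row, one-cell prediction:
print(teds("{table{tr{td}{td}}}", "{table{tr{td}}}"))  # 0.75
```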

Methodology

SPRINT methodology figure

Datasets

The MUSTARD dataset can be downloaded from this link - dataset link

Alternatively, you can use the Hugging Face datasets library to download the dataset - link (a loading sketch follows below).
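
Loading the dataset with the datasets library might look like the sketch below. Note that the hub identifier shown is a hypothetical placeholder for illustration; use the actual path from the link above:

```python
# A sketch of downloading MUSTARD via the Hugging Face `datasets` library.
# NOTE: "IITB-LEAP-OCR/MUSTARD" is a hypothetical hub id for illustration;
# substitute the identifier from the dataset link above.
from datasets import load_dataset

mustard = load_dataset("IITB-LEAP-OCR/MUSTARD")
print(mustard)  # prints the splits, features, and row counts
```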

MUSTARD Dataset

The MUSTARD dataset has been curated from various magazines and contains printed, scanned, and scene-text tables.

Sample images from the MUSTARD dataset

Results

TEDS-S scores (%) on standard benchmarks:

| Dataset | TableFormer + OTSL Simple | TableFormer + OTSL Complex | TableFormer + OTSL Overall | Ours Simple | Ours Complex | Ours Overall |
|---|---|---|---|---|---|---|
| PubTabNet | 96.50 | 93.40 | 95.50 | 98.20 | 96.24 | 97.55 |
| FinTabNet | 95.50 | 96.10 | 95.90 | 98.36 | 97.99 | 98.17 |
| PubTables-1M | 98.70 | 96.40 | 97.70 | 98.92 | 96.54 | 97.68 |

TEDS-S scores (%) on MUSTARD, by modality and language:

| Modality | Language | MTL-TabNet Simple | MTL-TabNet Complex | MTL-TabNet Overall | Ours Simple | Ours Complex | Ours Overall |
|---|---|---|---|---|---|---|---|
| Document Tables (Printed and Scanned) | Assamese | 79.39 | 73.40 | 76.54 | 88.09 | 88.74 | 88.40 |
| | Bengali | 71.68 | 60.02 | 61.42 | 77.24 | 78.52 | 78.36 |
| | Gujarati | 85.12 | 76.72 | 79.63 | 87.79 | 81.34 | 83.58 |
| | Hindi | 73.80 | 76.60 | 75.04 | 85.68 | 88.22 | 86.81 |
| | Kannada | 68.82 | 66.73 | 67.20 | 71.84 | 79.02 | 77.34 |
| | Malayalam | 82.57 | 79.34 | 81.07 | 86.41 | 85.13 | 85.81 |
| | Oriya | 85.28 | 78.03 | 82.84 | 91.55 | 85.20 | 89.41 |
| | Punjabi | 65.08 | 48.63 | 51.54 | 86.91 | 79.65 | 80.93 |
| | Tamil | 81.96 | 71.88 | 77.83 | 94.91 | 85.87 | 91.21 |
| | Telugu | 85.07 | 79.28 | 82.17 | 93.70 | 86.00 | 89.85 |
| | Urdu | 70.94 | 69.74 | 70.03 | 81.39 | 75.38 | 76.86 |
| | Chinese | 92.43 | 81.58 | 86.15 | 98.11 | 86.00 | 91.10 |
| Scene Tables | English | 76.19 | 78.01 | 76.53 | 88.98 | 76.14 | 85.71 |
| | Chinese | 69.40 | 66.65 | 68.94 | 88.62 | 81.96 | 87.27 |
| Overall | | 77.70 | 71.90 | 74.07 | 87.23 | 82.66 | 85.19 |


References

  1. MASTER
  2. MTL-TabNet
  3. DocTR

License

This work is licensed under the MIT License.

Acknowledgements

We acknowledge the support of a grant from IRCC, IIT Bombay, and MEITY, Government of India, through the National Language Translation Mission-Bhashini project.

Authors' Contact Information

  1. Badri Vishal Kasuba
  2. Dhruv Kudale

Questions or Issues

We conclude by welcoming further contributions toward seamless, script-agnostic Table Structure Recognition. Thank you for your interest in our research.

Citation

If you use this paper or the accompanying code/data in your research, please cite it as:

@InProceedings{10.1007/978-3-031-70549-6_21,
  author="Kudale, Dhruv and Kasuba, Badri Vishal and Subramanian, Venkatapathy and Chaudhuri, Parag and Ramakrishnan, Ganesh",
  editor="Barney Smith, Elisa H. and Liwicki, Marcus and Peng, Liangrui",
  title="SPRINT: Script-agnostic Structure Recognition in Tables",
  booktitle="Document Analysis and Recognition - ICDAR 2024",
  year="2024",
  publisher="Springer Nature Switzerland",
  address="Cham",
  pages="350--367",
  isbn="978-3-031-70549-6",
  url="https://arxiv.org/abs/2503.11932"
}
