FUNSD-polygon-dataset

This tool generates an augmented, polygon-based dataset from the Form Understanding in Noisy Scanned Documents (FUNSD) dataset.

The resulting dataset contains the same annotations, but the images are augmented with a set of transformations such as perspective, brightness, and contrast changes, plus random lines and "reflections".

The idea is to generate a dataset that mimics real-world images taken by users with their mobile phones, rather than a scanned-document dataset. For that reason, this dataset's annotations are polygon-based instead of the usual bounding boxes: each region is described by four corner points rather than two.
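To illustrate why four corner points are needed, here is a minimal sketch of what a perspective transformation does to an axis-aligned box. It is not taken from this repository: the function names and the matrix H are hypothetical, and the real augmentation pipeline may compute its transforms differently.

```python
def warp_point(h, x, y):
    """Apply a 3x3 homography h (nested lists) to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def warp_polygon(h, polygon):
    """Warp every corner of a polygon given as a list of (x, y) tuples."""
    return [warp_point(h, x, y) for x, y in polygon]


# A box (x1, y1, x2, y2) = (0, 0, 100, 50) written as four corners,
# clockwise from the top-left.
corners = [(0, 0), (100, 0), (100, 50), (0, 50)]

# Hypothetical homography: identity plus a small perspective term.
H = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]

warped = warp_polygon(H, corners)

# The left edge keeps its height of 50, while the right edge shrinks,
# so the result is no longer a rectangle.
left_height = warped[3][1] - warped[0][1]
right_height = warped[2][1] - warped[1][1]
```

After the warp the left and right edges have different heights, so two corner points can no longer describe the region exactly; four can.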

A more detailed description of the annotation properties can be found here.

How to use

Requirements

Install the requirements with pip install -r requirements.txt

Download the dataset

Run download_FUNSD.py to download the dataset. This will download the dataset to the datasets folder.

Transform coordinates to polygons

Run coord-to-poly.py to generate the polygon-based annotations. This will generate a new folder called datasets/FUNSD_polygon.
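The conversion itself can be sketched as follows. This is an illustration under assumptions, not the actual contents of coord-to-poly.py: it relies on the original FUNSD annotation layout (a "form" list whose entities and words each carry a "box" of [x1, y1, x2, y2]), while the "polygon" key name and the clockwise corner order are hypothetical choices.

```python
def box_to_polygon(box):
    """Expand [x1, y1, x2, y2] into four corners, clockwise from top-left."""
    x1, y1, x2, y2 = box
    return [[x1, y1], [x2, y1], [x2, y2], [x1, y2]]


def convert_annotation(annotation):
    """Return a copy of a FUNSD-style annotation with polygons added.

    The "polygon" key is an assumed name; the input is left unmodified.
    """
    converted = {"form": []}
    for entity in annotation["form"]:
        new_entity = dict(entity)
        new_entity["polygon"] = box_to_polygon(entity["box"])
        new_entity["words"] = [
            {**word, "polygon": box_to_polygon(word["box"])}
            for word in entity.get("words", [])
        ]
        converted["form"].append(new_entity)
    return converted


# Tiny hand-written sample in the FUNSD layout (not real dataset content).
sample = {
    "form": [
        {
            "id": 0,
            "text": "Date:",
            "label": "question",
            "box": [10, 20, 110, 40],
            "words": [{"text": "Date:", "box": [10, 20, 110, 40]}],
            "linking": [],
        }
    ]
}

result = convert_annotation(sample)
```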

Generate the augmented dataset

Run main.py to generate the augmented dataset. This will generate a new folder called datasets/FUNSD_polygon_augmented.

An optional parameter --multiplier is available to control the number of augmented images that will be generated for each input image. Defaults to 5.
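As a rough sketch of how such a flag behaves, the snippet below parses --multiplier with a default of 5 and lists one output name per augmented copy. Only the flag name and its default come from this README; the parser setup, the plan_outputs helper, and the "_aug" file naming are hypothetical and may differ from what main.py does.

```python
import argparse
import os


def build_parser():
    """Build a parser exposing the --multiplier flag described above."""
    parser = argparse.ArgumentParser(description="Generate augmented images.")
    parser.add_argument(
        "--multiplier",
        type=int,
        default=5,
        help="number of augmented images generated per input image",
    )
    return parser


def plan_outputs(image_names, multiplier):
    """List hypothetical output file names: one per augmented copy."""
    outputs = []
    for name in image_names:
        stem, ext = os.path.splitext(name)
        for i in range(multiplier):
            outputs.append(f"{stem}_aug{i}{ext}")
    return outputs


# Equivalent to running: python main.py --multiplier 2
args = build_parser().parse_args(["--multiplier", "2"])
names = plan_outputs(["0001.png"], args.multiplier)
```

With a multiplier of 2, each input image yields two augmented images, so the output dataset grows linearly with this value.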

