Detect the tables in a form and extract the tables as well as the cells of the tables.

arnavdutta/Table-Detection-Extraction

Table Detection & Extraction From The Forms


Functionality:

  • Detects all the tables in a form page.
  • Draws a bounding box around each table.
  • Segments each table out and extracts its cells.

Steps:

  1. Convert the image to grayscale.
  2. Apply binary thresholding.
  3. Extract all the vertical lines using a vertical kernel built with cv2.getStructuringElement.
  4. Similarly, extract all the horizontal lines using a horizontal kernel built with cv2.getStructuringElement.
  5. Combine the horizontal and vertical lines using cv2.addWeighted.
  6. Apply a morphological transformation such as cv2.erode to get crisp lines and better results.
  7. Find the contours and extract the rectangles (table cells).

Prerequisites

  1. Python v3.6
  2. OpenCV v3.4 (import cv2)
  3. NumPy v1.16 (import numpy as np)
  4. os, part of the Python standard library (import os)
