better reflect current state of the repo
bertsky committed Nov 20, 2019
1 parent faaf83f commit 1aff083
Showing 1 changed file with 12 additions and 4 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -2,7 +2,15 @@

This repository aims to provide a number of OCR-D-compliant processors for layout analysis and evaluation.

- - pattern-based segmentation aka. `ocrd-segment-via-template` (input file groups N=1, based on a PAGE template, e.g. from Aletheia, and some XSLT or Python to apply it to the input file group)
- - data-driven segmentation aka. `ocrd-segment-via-model` (input file groups N=1, based on a statistical model, e.g. Neural Network)
- - comparing different layout segmentations aka. `ocrd-segment-evaluate` (input file groups N = 2, compute the distance between two segmentations, e.g. automatic vs. manual)
- - repairing of layout segmentations aka. `ocrd-segment-repair` (input file groups N >= 1, based on heuristics implemented using Shapely)
+ - extracting page images (including results from preprocessing like cropping, deskewing or binarization) along with region polygon coordinates and metadata:
+   - [ocrd-segment-extract-regions](ocrd_segment/extract_regions.py)
+ - extracting line images (including results from preprocessing like cropping, deskewing, dewarping or binarization) along with line polygon coordinates and metadata:
+   - [ocrd-segment-extract-lines](ocrd_segment/extract_lines.py)
+ - comparing different layout segmentations (input file groups N = 2, compute the distance between two segmentations, e.g. automatic vs. manual):
+   - [ocrd-segment-evaluate](ocrd_segment/evaluate.py) :construction: (very early stage)
+ - repairing layout segmentations (input file groups N >= 1, based on heuristics implemented using Shapely):
+   - [ocrd-segment-repair](ocrd_segment/repair.py) :construction: (much to be done)
+ - pattern-based segmentation (input file groups N=1, based on a PAGE template, e.g. from Aletheia, and some XSLT or Python to apply it to the input file group)
+   - `ocrd-segment-via-template` :construction: (unpublished)
+ - data-driven segmentation (input file groups N=1, based on a statistical model, e.g. Neural Network)
+   - `ocrd-segment-via-model` :construction: (unpublished)
