Scripts for compatibilitising between VISL-CG3, Apertium, CoNLL-X and Universal Dependencies

ftyers/ud-scripts

Latest commit: 26236d6 · May 18, 2019 (64 commits)
ud-scripts

conllu-voting:

Run the Chu-Liu-Edmonds algorithm over a graph built from CoNLL-U files.
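
As a rough illustration of the voting idea, the sketch below counts how many parses propose each head for each token and then picks the most-voted head per token. It is not the script's code and it is not Chu-Liu-Edmonds: it covers only the vote-counting and greedy selection that such an algorithm starts from, without the cycle contraction that guarantees a tree. The names `arc_votes`, `greedy_heads` and `PARSES` are hypothetical, and the input is assumed to be plain lists of head indices instead of CoNLL-U files.

```python
from collections import Counter, defaultdict


def arc_votes(parses):
    """Count head proposals per token.

    parses[k][i] is the head (0 = root) that parse k assigns to token i + 1.
    Returns {dependent: Counter({head: number_of_votes})}.
    """
    votes = defaultdict(Counter)
    for heads in parses:
        for dep, head in enumerate(heads, start=1):
            votes[dep][head] += 1
    return votes


def greedy_heads(votes):
    """Pick the most-voted head for each token, in token order.

    This may produce cycles; a maximum spanning tree algorithm such as
    Chu-Liu-Edmonds would be needed to repair them.
    """
    return [votes[dep].most_common(1)[0][0] for dep in sorted(votes)]


# Three hypothetical parses of the same three-token sentence.
PARSES = [
    [2, 0, 2],
    [2, 0, 2],
    [3, 0, 2],
]

print(greedy_heads(arc_votes(PARSES)))
```

The vote counts can be read as edge weights in a graph over the tokens, which is the kind of input a maximum spanning tree search works on.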

conllu-eval:

Calculate UAS and LAS against a gold standard.
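
For reference, UAS is the share of tokens whose head matches the gold standard, and LAS is the share whose head and dependency label both match. The sketch below shows that calculation on two small in-memory CoNLL-U strings. It is an illustration under stated assumptions, not the script itself: the function names and sample data are hypothetical, both inputs are assumed to have identical tokenisation, and multiword-token ranges and empty nodes are skipped.

```python
def read_arcs(conllu_text):
    """Return a list of (head, deprel) pairs, one per syntactic word."""
    arcs = []
    for line in conllu_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip sentence breaks and comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        arcs.append((cols[6], cols[7]))  # HEAD and DEPREL columns
    return arcs


def attachment_scores(gold_text, system_text):
    """Return (UAS, LAS) as fractions between 0 and 1."""
    gold = read_arcs(gold_text)
    system = read_arcs(system_text)
    if len(gold) != len(system):
        raise ValueError("gold and system have different token counts")
    if not gold:
        return 0.0, 0.0
    uas_hits = sum(g[0] == s[0] for g, s in zip(gold, system))
    las_hits = sum(g == s for g, s in zip(gold, system))
    return uas_hits / len(gold), las_hits / len(gold)


def make_conllu(rows):
    """Join rows of ten columns into a CoNLL-U sentence (helper for the demo)."""
    return "\n".join("\t".join(row) for row in rows) + "\n\n"


GOLD = make_conllu([
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "barks", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
])
# Same heads as GOLD, but token 2 carries the wrong label.
SYSTEM = make_conllu([
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "dog", "dog", "NOUN", "_", "_", "3", "obj", "_", "_"],
    ["3", "barks", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
])

uas, las = attachment_scores(GOLD, SYSTEM)
print("UAS: %.2f  LAS: %.2f" % (uas, las))
```

In the sample data all three heads agree and one label differs, so UAS is 3/3 and LAS is 2/3.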

conllu-to-tikzdep.py:

Convert CoNLL-U to TikZ-dependency graphs, writing one file per input sentence.
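
To give an idea of the target format, the sketch below renders a single CoNLL-U sentence as a `dependency` environment from the LaTeX `tikz-dependency` package, using `\deproot` for the root and `\depedge{head}{dependent}{label}` for every other arc. This is a minimal sketch and not the script's actual output: the function name and sample sentence are hypothetical, LaTeX special characters in word forms are not escaped, and nothing is written to disk.

```python
def conllu_to_tikz(sentence):
    """Render one CoNLL-U sentence as a tikz-dependency environment."""
    forms, edges = [], []
    for line in sentence.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        idx, form, head, deprel = cols[0], cols[1], cols[6], cols[7]
        forms.append(form)
        if head == "0":
            edges.append("\\deproot{%s}{%s}" % (idx, deprel))
        else:
            # Arcs are drawn from the head to the dependent.
            edges.append("\\depedge{%s}{%s}{%s}" % (head, idx, deprel))
    lines = [
        "\\begin{dependency}",
        "  \\begin{deptext}",
        "    " + " \\& ".join(forms) + " \\\\",
        "  \\end{deptext}",
    ]
    lines += ["  " + edge for edge in edges]
    lines.append("\\end{dependency}")
    return "\n".join(lines)


SENTENCE = "\n".join("\t".join(row) for row in [
    ["1", "The", "the", "DET", "_", "_", "2", "det", "_", "_"],
    ["2", "dog", "dog", "NOUN", "_", "_", "3", "nsubj", "_", "_"],
    ["3", "barks", "bark", "VERB", "_", "_", "0", "root", "_", "_"],
])

print(conllu_to_tikz(SENTENCE))
```

The printed text can be pasted into a LaTeX document that loads `tikz-dependency`.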

conllu-to-matxin.py:

Convert CoNLL-U to Matxin XML format

matxin-to-conllu.py:

Convert Matxin XML format to CoNLL-U

conllu-feats.py:

Convert lemma, POS and feature annotation from some other format to UD using a 6- or 8-column rule file.

conllu-trim.py:

Remove double blank lines between sentences
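
The effect can be sketched as collapsing every run of blank lines into a single blank line, so that sentences are separated by exactly one empty line as CoNLL-U expects. The function below is an illustrative sketch with a hypothetical name, not the script's code. It assumes that whitespace-only lines count as blank and that the output should end with a newline.

```python
def trim_blank_lines(text):
    """Collapse consecutive blank lines into one."""
    out = []
    previous_blank = False
    for line in text.splitlines():
        blank = not line.strip()
        if blank and previous_blank:
            continue  # drop the second and later blank lines in a run
        out.append("" if blank else line)
        previous_blank = blank
    if not out:
        return ""
    return "\n".join(out) + "\n"


messy = "1\ta\n\n\n1\tb\n\n"
print(repr(trim_blank_lines(messy)))
```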

vislcg3-flatten.sh:

Flatten VISL-CG3 output, replacing subreadings with tokens that have a null surface form (in practice, '*').

vislcg3-to-conllx-input.py:

Convert VISL-CG3 output to CoNLL-X format.

vislcg3-split-space.py:

Split multiword tokens into two tokens when the surface form and the lemma contain the same number of spaces.
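
The splitting condition can be illustrated on plain strings: if the surface form and the lemma contain the same non-zero number of spaces, the parts can be paired up position by position. The sketch below is an assumption-laden illustration and not the script's code: the function name is hypothetical, it works on a bare (surface, lemma) pair instead of VISL-CG3 output, and it pairs up however many parts there are instead of limiting itself to two.

```python
def split_on_spaces(surface, lemma):
    """Split a multiword token into aligned (surface, lemma) pairs.

    Returns the token unchanged, as a one-element list, when the space
    counts differ or when there is nothing to split.
    """
    n_spaces = surface.count(" ")
    if n_spaces == 0 or n_spaces != lemma.count(" "):
        return [(surface, lemma)]
    return list(zip(surface.split(" "), lemma.split(" ")))


print(split_on_spaces("in front", "in front"))
print(split_on_spaces("of course", "ofcourse"))
```

The second call leaves the token intact because the lemma has no space to align with.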
