This repository contains Ancient Greek texts after tokenization, sentence split, and POS-tagging

gcelano/POStaggedAncientGreekXML


POS-tagged AncientGreek texts (v1.2.0)

This repository contains the POS-tagged (CTSized) Ancient Greek literary texts from the following releases:

The POS tagging was generated fully automatically with the MATE tagger, which was trained on the Perseus treebank data:

The tagger achieved an accuracy of 88%. More details can be found in the article:

Changes

Release 1.2.0:

  • New texts have been added. Tokenization now detects capital-letter abbreviations such as ΑΠΟΛ.
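
As a rough illustration of the abbreviation handling mentioned above, the following Python sketch keeps a run of uppercase Greek letters together with its trailing period as one token. It is not the tokenizer used for this repository: the regular expression, the two-letter minimum, and the function name `tokenize` are assumptions made for the example.

```python
import re
import unicodedata

# Assumed rule (hypothetical): two or more uppercase Greek letters
# directly followed by a period form a single abbreviation token.
# Otherwise, words and punctuation marks are separate tokens.
TOKEN = re.compile(r"[Α-Ω]{2,}\.|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping capital-letter abbreviations whole."""
    # Normalize to NFC so accented letters are single code points
    # that the \w class can match.
    text = unicodedata.normalize("NFC", text)
    return TOKEN.findall(text)


if __name__ == "__main__":
    print(tokenize("ΑΠΟΛ. λογος."))
```

A rule this simple would also treat an all-capitals word at the end of a sentence as an abbreviation, so a real tokenizer would likely need extra context or a list of known abbreviations.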

Release 1.1.0:

  • Corrected the CTS URN structure so that the elements seg and p are taken into account (div, seg, p, and l are currently considered).
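
One way to picture how the elements div, seg, p, and l could contribute to a CTS URN passage reference is sketched below in Python. This is only an assumption about the general idea, not the code used to produce these files: the function name `passage_urns`, the use of the `n` attribute, the rule of emitting one URN per innermost citable element, and the placeholder work URN are all hypothetical.

```python
import xml.etree.ElementTree as ET

# The four element names the changelog says are considered.
CITABLE = {"div", "seg", "p", "l"}


def passage_urns(xml_text, work_urn):
    """Return one URN per innermost citable element (assumed rule)."""
    root = ET.fromstring(xml_text)
    urns = []

    def walk(elem, path):
        # Drop a namespace prefix such as "{http://www.tei-c.org/ns/1.0}".
        name = elem.tag.rsplit("}", 1)[-1]
        citable = name in CITABLE and elem.get("n") is not None
        if citable:
            path = path + [elem.get("n")]
        before = len(urns)
        for child in elem:
            walk(child, path)
        # Emit a URN only if no deeper citable element produced one.
        if citable and len(urns) == before:
            urns.append(f"{work_urn}:{'.'.join(path)}")

    walk(root, [])
    return urns


if __name__ == "__main__":
    sample = (
        '<body><div n="1"><l n="1">a</l><l n="2">b</l></div>'
        '<div n="2"><p n="1"><seg n="1">c</seg></p></div></body>'
    )
    # "urn:cts:greekLit:tlg0000.tlg000.example" is a made-up placeholder.
    for urn in passage_urns(sample, "urn:cts:greekLit:tlg0000.tlg000.example"):
        print(urn)
```

In this sketch, ignoring seg and p would collapse the third passage to the reference `2`, which shows why including those elements changes the resulting URN structure.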

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
