Support Pharaoh format for parallel corpus alignment #4706

Open
fishfree opened this issue Apr 10, 2024 · 1 comment
Labels
⭐️ Enhancement New feature or request

Comments

@fishfree

For example, awesome-align can generate word-by-word alignments for a parallel corpus, i.e. files in the Pharaoh format.
Alternatively, can this already be achieved in the latest INCEpTION release with some workaround?
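
For readers unfamiliar with the format: a Pharaoh alignment file has one line per sentence pair, and each line is a space-separated list of `i-j` pairs meaning that source token `i` is aligned to target token `j`. The sketch below shows one way to read such a line. It assumes zero-based indices (as awesome-align emits them) and whitespace-tokenized sentences; the function names `parse_pharaoh_line` and `aligned_words` are hypothetical and not part of awesome-align or INCEpTION.

```python
def parse_pharaoh_line(line: str) -> list[tuple[int, int]]:
    """Parse one Pharaoh line such as '0-0 1-2 2-1' into index pairs."""
    pairs = []
    for item in line.split():
        src, sep, tgt = item.partition("-")
        if not sep:
            raise ValueError(f"not an i-j pair: {item!r}")
        pairs.append((int(src), int(tgt)))
    return pairs


def aligned_words(source: str, target: str, alignment: str) -> list[tuple[str, str]]:
    """Resolve index pairs to word pairs (assumes zero-based indices)."""
    src_tokens = source.split()
    tgt_tokens = target.split()
    return [(src_tokens[i], tgt_tokens[j]) for i, j in parse_pharaoh_line(alignment)]


if __name__ == "__main__":
    for src_word, tgt_word in aligned_words("das ist gut", "that is good", "0-0 1-1 2-2"):
        print(src_word, "->", tgt_word)
```
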

And in this paper, there is a related screenshot:
image

@reckart
Member

reckart commented Apr 11, 2024

INCEpTION presently supports the annotation of spans and of relations between spans, but not relations between relations - thus dependency parse trees can be represented, but not syntactic parse trees.

There is currently no support for parallel annotated texts in INCEpTION. However, you might already be able to implement some support for aligned texts:

  • define a custom XML format that can represent pairs of sentences and that is associated with a style sheet that renders them in an appropriate manner.
  • possibly implement an editor plugin that allows creating/editing alignment relations between the words and that also allows creating normal annotations on the words - but ensures that annotations do not cross sentence boundaries or language boundaries.
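
As a rough illustration of the first point, the sketch below turns one sentence pair plus its Pharaoh alignment line into a small XML fragment. Everything about the XML layout is an assumption made for this example: the element names (`pair`, `sentence`, `w`, `alignment`, `link`) and the `s0`/`t0` id scheme are hypothetical and do not correspond to any format INCEpTION ships with. A real custom format would also need the accompanying style sheet, which is not shown here.

```python
import xml.etree.ElementTree as ET


def build_pair_xml(source: str, target: str, alignment: str) -> str:
    """Serialize one aligned sentence pair to a hypothetical XML layout."""
    pair = ET.Element("pair")
    # One <sentence> per side; token ids are s0, s1, ... and t0, t1, ...
    for side, text in (("source", source), ("target", target)):
        sentence = ET.SubElement(pair, "sentence", {"side": side})
        for index, word in enumerate(text.split()):
            token = ET.SubElement(sentence, "w", {"id": f"{side[0]}{index}"})
            token.text = word
    # One <link> per Pharaoh i-j pair (zero-based indices assumed).
    links = ET.SubElement(pair, "alignment")
    for item in alignment.split():
        src, tgt = item.split("-")
        ET.SubElement(links, "link", {"from": f"s{int(src)}", "to": f"t{int(tgt)}"})
    return ET.tostring(pair, encoding="unicode")


if __name__ == "__main__":
    print(build_pair_xml("das ist gut", "that is good", "0-0 1-1 2-2"))
```

Keeping each sentence in its own element gives an editor plugin an explicit boundary to check against when it rejects annotations that would span both sides.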

@reckart reckart added this to the ⭐️ Feature backlog milestone Apr 11, 2024
@reckart reckart added the ⭐️ Enhancement New feature or request label Apr 11, 2024