Comma sensitivity in sentences #18
Thanks! This is indeed the right place to submit issues such as this one, which describes a bona fide bug. It looks like we need to tweak the grammar for this syntactic structure; it is incorrectly preferring to see
By the way, a good way to visualize the sentence trees is to enter the text into Greynir: https://greynir.is/treegrid?txt=S%C3%B3tt%20er%20um%20leyfi%20til%20a%C3%B0%20byggja%2050%20leigu%C3%ADb%C3%BA%C3%B0ir%20fyrir%20n%C3%A1msmenn%20%C3%A1%20l%C3%B3%C3%B0%20vi%C3%B0%20Austurhl%C3%AD%C3%B0.
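For comparing the two variants side by side, the treegrid links can be generated rather than typed by hand. The following is only a minimal sketch: the helper name `treegrid_url` is made up for illustration, the URL pattern is simply copied from the link above, and the comma variant is reconstructed from the description in this issue (comma before "á lóð"), not taken from the original report.

```python
from urllib.parse import quote

# Base URL copied from the link posted in this thread; that the "txt" query
# parameter accepts any sentence is an assumption based on that one link.
TREEGRID_BASE = "https://greynir.is/treegrid?txt="


def treegrid_url(sentence: str) -> str:
    """Build a treegrid link for a sentence (hypothetical helper)."""
    # safe="" percent-encodes everything except unreserved characters,
    # so spaces become %20 and a comma becomes %2C.
    return TREEGRID_BASE + quote(sentence, safe="")


# Sentence decoded from the link above.
plain = (
    "Sótt er um leyfi til að byggja 50 leiguíbúðir "
    "fyrir námsmenn á lóð við Austurhlíð."
)
# Variant with a comma before "á lóð", reconstructed from the issue text.
with_comma = plain.replace(" á lóð", ", á lóð")

for sentence in (plain, with_comma):
    print(treegrid_url(sentence))
```

Opening both printed links in a browser should show the two parse trees, which makes it easier to see where the comma changes the analysis of "á".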
Great! Here is another potential bug I spotted: "Málinu er vísað til umsagnar skipulagsfulltrúa vegna svala." Here "svala" seems to be parsed as a different noun, not as "balcony". The balcony sense is probably more common, so maybe a grammar file tweak can help with this. I'm not even sure what "svali" means.
By the way, in the earlier bug ("Sótt var um leyfi...") Greynir is recognizing "Sótt" as the noun, not the verb; and then it constructs a double verb phrase with the verbs "var" and "á" hanging off the subject "Sótt".
...and "svala" can also be a bird, i.e. a feminine noun, in addition to the two masculine ones ("svalur" and "svali").
A fix for "svala" is ready and will be in the next commit to the config file.
I’m not familiar with the parsing pipeline, but I thought I would share an instance where the parser tripped in a way that surprised me:
This is fine and gives me the right lemmas for leiguíbúð, námsmaður etc.
The comma before "á lóð" gives me the "eiga" lemma for "á" instead of just "á".
Sorry if GitHub issues is the wrong place. I’m mainly curious about the roadmap, design and limitations. I assume Greynir uses commas to fragment sentences to keep down the parse pathways.
BTW this is a real world example.
Loving Greynir and following your progress! ✨
EDIT: Screenshot might help