Fix ambiguous features on Swedish verbs #6
Comments
Just a quick note that the formatting script for Swedish verbs has moved to src/scribe_data/extract_transform/Swedish/verbs/format_verbs.py.
Updated, @Ainali! Thank you 🙏
I'll be using this query to clean up most of the data errors. Around 80-85% of the results there should be split into two separate lexemes. The rest are cases where there really are two acceptable forms. However, in several of these, one of the forms is not modern and should be marked as such. The query should probably check for language style (P6191) and filter some values.
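As a rough illustration of the language style idea, here is a minimal Python sketch that drops forms carrying an excluded style value before any duplicate check runs. The dictionary layout, the `drop_styled_forms` helper, and the `STYLE_NOT_MODERN` value are all hypothetical placeholders; the real filter would live in the query and use whatever P6191 values are agreed on.

```python
def drop_styled_forms(forms, excluded_styles):
    """Keep only forms whose style values do not overlap excluded_styles.

    Each form is assumed (hypothetically) to be a dict with a "styles" list
    holding the values of its language style (P6191) statements.
    """
    excluded = set(excluded_styles)
    return [form for form in forms if not excluded & set(form.get("styles", []))]


# Placeholder data: two forms share a feature set, one is marked as not modern.
forms = [
    {"representation": "form_a", "features": ["past", "active"], "styles": []},
    {"representation": "form_b", "features": ["past", "active"], "styles": ["STYLE_NOT_MODERN"]},
]

modern_forms = drop_styled_forms(forms, excluded_styles={"STYLE_NOT_MODERN"})
```

With the non-modern form filtered out, only one form remains for that feature set, so it would no longer look ambiguous.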
This is so epic, @Ainali 😊 Thanks so much! Would be happy to talk with you after the hackathon about what changes need to happen to the query. After a check-in I can try to make the changes, or we can do a quick call to talk over what needs to change. Whatever works best for you :) Really happy to have this issue getting some love!
I have now split all the ones that needed to be split into different lexemes. The ones that are left (21 in the query above) are probably mostly synonyms, but I have asked around to see if there is something grammatical that could be added to them to highlight any possible nuance between them.
I was thinking about messaging you about this 😊 Thanks so much for your efforts! Do I need to change anything in the query, or can I just run the normal update process? We still have some minor bug fixes for autocomplete and will add in a basic autosuggest prior to the next release, but we should have it out by, say, the end of next week :)
For now, it will just be an improvement if you run the normal update process. But I think we should keep the issue open to figure out the last remaining part.
Sounds great, thanks @Ainali :)
Terms
Languages
Swedish
Description
Many Swedish verbs have ambiguous features that don't allow their conjugations to be properly classified. Specifically, there are doubles of many feature sets, as can be seen on the Wikidata page for the verb överge. These duplicates should be distinguished, and the formatting script for Swedish verbs should be updated, as it is now written to remove any verb that has a duplicate value caused by ambiguous features.
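To make the problem concrete, here is a minimal Python sketch of how duplicated feature sets could be detected for a single verb. It is not the code in format_verbs.py; the `(representation, features)` tuple layout, the `find_ambiguous_feature_sets` helper, and the sample data are assumptions made purely for illustration.

```python
from collections import Counter


def find_ambiguous_feature_sets(forms):
    """Return the feature sets that more than one form of a lexeme shares.

    `forms` is assumed (hypothetically) to be a list of
    (representation, features) pairs for one verb.
    """
    counts = Counter(frozenset(features) for _, features in forms)
    return [set(feature_set) for feature_set, n in counts.items() if n > 1]


# Placeholder data: form_a and form_b carry identical grammatical features,
# so the conjugation slot they belong to cannot be decided.
forms = [
    ("form_a", ["past", "active"]),
    ("form_b", ["past", "active"]),
    ("form_c", ["supine", "active"]),
]

ambiguous = find_ambiguous_feature_sets(forms)
```

A verb for which this list is non-empty is the kind of entry the current script drops entirely; once the duplicates are distinguished on Wikidata, the list should come back empty and the verb can be kept.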