detect_language_mixed(): R Session Crashing when running on empty entries #3
Comments
Can you try to create a minimal reproducible example?
oh wow haha that is embarrassing
Probably just a little slip-up somewhere, haha. When I remove the empty entries it runs like a charm!
Hey!
I have a large dataset of mixed-language entries (assume 100k+) that I want to run cld3's language detection on in order to detect non-English snippets. However, the R session aborts with a fatal error as soon as I run it over certain entries. I was able to isolate the problem: as soon as detect_language_mixed() hits an empty entry (""), it fails and takes the whole session down with it. Neither cld2::detect_language_mixed() nor cld3::detect_language() seems to have that issue, so I'm assuming it would be an easy fix to skip these entries and return NA. It took me a while to figure this out, so implementing it in the next update might save others quite a bit of heartache. I'm running the latest cld3 release from CRAN (1.4.1).
Also, thanks for the great package! It's really helpful seeing that it seems to deal better with multi-language entries than cld2.
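Until this is handled inside the package, a guard along the following lines might serve as a workaround. This is only a sketch: `safe_detect` is a hypothetical helper, not part of cld3, and returning `NA` for skipped entries (including whitespace-only ones) is an assumption about the desired behaviour.

```r
# Hypothetical helper (not part of cld3): apply a detector function to each
# entry, skipping NA, empty, and whitespace-only strings so that they never
# reach the detector.
safe_detect <- function(texts, detector) {
  lapply(texts, function(x) {
    if (is.na(x) || !nzchar(trimws(x))) {
      return(NA)  # assumed placeholder for skipped entries
    }
    detector(x)
  })
}
```

With cld3 installed, this could be called as `safe_detect(entries, cld3::detect_language_mixed)`, where `entries` stands for the character vector of texts. The result is a list with one element per entry, so row positions stay aligned with the original data.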