Sensitive to noise #130

Answered by pemistahl
bruunand asked this question in Q&A

Hi Anders, thanks for trying my library and for reaching out to me.

I'm afraid there is nothing you can do about this. The language detector is sensitive to noise, especially for very short texts, because there are not enough distinct n-grams available to calculate a reliable language estimate. If the sentence were longer and SIARxKnru3t were the only noisy word, the detector would surely output English. You should filter out the noise before trying to detect the language.
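
For illustration, here is a minimal sketch of such a pre-filter in Python. It is not part of the library. The heuristic (drop tokens that mix letters with digits, or that have an uppercase letter directly after a lowercase one), the function names `is_noisy` and `strip_noise`, and the sample sentence are all assumptions made for this example, so tune them to your own data.

```python
import string


def is_noisy(token: str) -> bool:
    """Hypothetical heuristic: flag tokens that look like random identifiers."""
    # Ignore surrounding punctuation such as commas or quotes.
    word = token.strip(string.punctuation)
    if not word:
        return False
    has_alpha = any(c.isalpha() for c in word)
    has_digit = any(c.isdigit() for c in word)
    # An uppercase letter right after a lowercase one, as in "xK".
    lower_then_upper = any(
        a.islower() and b.isupper() for a, b in zip(word, word[1:])
    )
    return (has_alpha and has_digit) or lower_then_upper


def strip_noise(text: str) -> str:
    """Return the text with noisy tokens removed, split on whitespace."""
    return " ".join(t for t in text.split() if not is_noisy(t))


# Made-up input for illustration; pass the cleaned text to the detector.
cleaned = strip_noise("This is SIARxKnru3t a test")
```

Note that a rule like this also drops legitimate words such as "iPhone" or "COVID19", so it trades some recall for cleaner input. For very short texts, what remains after filtering may still be too little for a reliable estimate.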

Answer selected by bruunand

This discussion was converted from issue #129 on April 27, 2022 07:57.