Yake Stemming? #136
Comments
@ygorg Please have a look at this. I can confirm this downside.
Hi,
My team uses your yake extractor. We get pretty good results.
We would like to enable stemming. However, after studying the source code, the stemming implementation seems to have two major shortcomings.
With stemming enabled, the final weight calculation does not take term frequency into account, whereas the plain-word implementation does (lines 358, 390). In addition, the plain-word implementation distinguishes between stopwords and non-stopwords (lines 365-383), and the stemmed one does not.
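To illustrate why the missing term frequency matters, here is a minimal sketch. It is not the library's code: it assumes the candidate score from the YAKE paper, S(kw) = prod(S(w)) / (TF(kw) * (1 + sum(S(w)))), where a lower score means a more relevant keyphrase. The function name `candidate_score` and the sample word scores are hypothetical.

```python
from math import prod


def candidate_score(word_scores, term_frequency=None):
    """Combine per-word scores S(w) into a candidate score (lower is better).

    Hypothetical helper: when term_frequency is None, the TF factor is
    left out, which mirrors the behaviour we believe we see with stemming.
    """
    score = prod(word_scores) / (1 + sum(word_scores))
    if term_frequency is not None:
        # Frequent candidates get a lower (better) score.
        score /= term_frequency
    return score


# Made-up per-word scores for a two-word candidate that occurs 4 times.
word_scores = [0.2, 0.5]
without_tf = candidate_score(word_scores)
with_tf = candidate_score(word_scores, term_frequency=4)

print(without_tf, with_tf)
```

Without the TF factor, a candidate that occurs four times receives the same score as one that occurs once, so the two variants can rank candidates differently.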
Would it be possible to offer the same behaviour in the stemmed version? In general, should we go with the stemmed or the non-stemmed version? What would you suggest?
My team and I are grateful for any help. :)