Yake Stemming? #136

ace-kay-law-neo · 2020-07-14T19:13:37Z

My team uses your yake extractor. We get pretty good results.

We would like to enable stemming. However, after studying the source code, the stemming implementation seems to have two major shortcomings.

First, the final weight calculation with stemming doesn't take term frequency into account, whereas the plain-word implementation does (lines 358, 390). Second, the plain-word implementation also differentiates between stopwords and non-stopwords (lines 365-383), which the stemmed one does not seem to do.
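
To make the first point concrete, here is a minimal sketch of the two calculations as described above. It is not the library's actual code: the function names `score_with_tf` and `score_without_tf` and the example weights are made up for illustration, and the formula assumes the usual YAKE candidate score (product of word weights divided by term frequency times one plus their sum), where a lower score means a more relevant candidate.

```python
import math


def score_with_tf(word_weights, tf):
    """Assumed plain-word behaviour: the score is divided by term frequency."""
    return math.prod(word_weights) / (tf * (1 + sum(word_weights)))


def score_without_tf(word_weights):
    """Assumed stemmed behaviour: same formula, but term frequency is ignored."""
    return math.prod(word_weights) / (1 + sum(word_weights))


# Hypothetical per-word weights for a two-word candidate seen three times.
weights = [0.2, 0.5]
print(score_with_tf(weights, tf=3))
print(score_without_tf(weights))
```

Under these assumptions, a candidate that occurs several times gets a lower (better) score in the plain-word path, while the stemmed path scores it as if it occurred once.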

Is it possible to offer the same features in the stemmed version? In general, should we go with the stemmed or the non-stemmed version? What would you suggest?

My team and I are grateful for any help. :)

janandreschweiger · 2020-07-16T06:41:56Z

@ygorg Please have a look at this. I can confirm this downside.

ygorg · 2021-01-08T13:31:43Z

Hi,
I tried some things. I need to make sure the performance is the same as before, so I need to find a way to show that the updated code is backwards compatible.
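
One possible way to check this, sketched under assumptions: compare the candidate scores produced before and after the change on the same documents. The helper `scores_match` and the example dictionaries below are hypothetical and not part of the library. They only assume that each version can produce a `{candidate: score}` mapping.

```python
import math


def scores_match(old_scores, new_scores, rel_tol=1e-9):
    """Return True if both mappings hold the same candidates with equal scores."""
    if old_scores.keys() != new_scores.keys():
        return False
    return all(
        math.isclose(old_scores[cand], new_scores[cand], rel_tol=rel_tol)
        for cand in old_scores
    )


# Made-up scores standing in for the output of the old and the updated code.
baseline = {"keyphrase extraction": 0.0196, "stemming": 0.0588}
updated = {"keyphrase extraction": 0.0196, "stemming": 0.0588}
print(scores_match(baseline, updated))
```

Running such a comparison with stemming disabled on a fixed set of documents would indicate whether the non-stemmed behaviour is unchanged.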
