- FIX: Greek stemmer bug fix by @NC0DER in #175
- FIX: Avoid adding empty space between words and punctuation. by @gianpd in #178
- DOC: Fix a few typos by @timgates42 in #182
- FEATURE: Add Arabic language support by @issam9 in #181
- FEATURE: Add support for Ukrainian language in #168
- FEATURE: Add support for the Greek Language by @NC0DER in #169
- FEATURE: Return the summary size by custom callable object in #161
- FIX: Compatibility for `from collections import Sequence` for Python 3.10
- FIX: Fix SumBasicSummarizer with stemmer in #166
- INCOMPATIBILITY: Dropped official support for Python 2.7. It should still work if you install Python 2 compatible dependencies.
- FEATURE: Add basic Korean support by @kimbyungnam in #129
- FEATURE: Add support for the Hebrew language by @miso-belica in #151
- FIX: Allow words with dashes/apostrophe returned from tokenizer by @miso-belica in #144
- FIX: Ignore empty sentences from tokenizer by @miso-belica in #153
- Basic documentation by @miso-belica in #133
- Speedup of the TextRank algorithm by @miso-belica in #140
- Fix missing license in sdist by @dopplershift in #157
- Added test and call for stemmer by @bdalal in #131
- Fix simple typo: referene -> reference by @timgates42 in #143
- Add codecov service to tests by @miso-belica in #136
- Add gitpod config by @miso-belica in #138
- Try to run Python 3.7 and 3.8 on TravisCI by @miso-belica in #130
- Fix TravisCI for Python 3.4 by @miso-belica in #134
- Open files for `PlaintextParser` in UTF-8 encoding #123
- Added support for Italian language #114
- Added support for ISO-639 language codes (`en`, `de`, `sk`, ...). #106
- `TextRankSummarizer` uses an iterative algorithm. The previous algorithm is called `ReductionSummarizer`. #100
- Added support for Chinese. #93
- Dropped support for distutils when installing sumy.
- Added support for Japanese. #79
- Fixed incorrect n-grams computation for more sentences. #84
- Fixed NLTK dependency for Python 3.3. NLTK 3.2 dropped support for Python 3.3 so sumy needs 3.1.
- Fixed missing stopwords in SumBasic summarizer. #74
- Added "--text" CLI parameter to summarize text in Emacs and other tools. #67
- Fixed computation of cosine similarity in LexRank summarizer. #63
- Fixed resource searching in .egg packages. #53
- Added support for Portuguese and Spanish. #49 #51
- Better error message when NLTK tokenizers are missing.
- Dropped support for Python 2.6 and 3.2. Only 2.7/3.3+ are officially supported now. Time to move :)
- CLI: Better message for unknown format.
- LexRank: fixed power method computation.
- Added some extra abbreviations (English, German) into tokenizer for better output.
- SumBasic: Added new summarization method - SumBasic. Thanks to Julian Griggs.
- KL: Added new summarization method - KL. Thanks to Julian Griggs.
- Added `requests` as a dependency to fix issues with downloading pages.
- Better documentation of expected Plaintext document format.
- Added possibility to specify format of input document for URL & stdin. Thanks to @Lucas-C.
- Added possibility to specify custom file with stop-words in CLI. Thanks to @Lucas-C.
- Added support for French language (added stopwords & stemmer). Thanks to @Lucas-C.
- Function `sumy.utils.get_stop_words` raises `LookupError` instead of `ValueError` for unknown language.
- Exception `LookupError` is raised for unknown language of stemmer instead of silently falling back to `null_stemmer`.
- Fixed installation of my own readability fork. Added `breadability` to the dependencies instead. #8. Thanks to @pratikpoddar.
- First public release.
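
The Python 3.10 compatibility entry above concerns the `collections.Sequence` alias, which Python 3.10 removed in favor of `collections.abc.Sequence`. The snippet below is a minimal sketch of a version-tolerant import, not necessarily the exact change made in sumy; the `is_sequence` helper is hypothetical and exists only for illustration.

```python
# Abstract base classes live in collections.abc; the old aliases in
# `collections` were removed in Python 3.10.
try:
    from collections.abc import Sequence
except ImportError:  # fallback for Python 2, where collections.abc does not exist
    from collections import Sequence


def is_sequence(value):
    """Hypothetical helper: True for list-like values, False for strings/bytes."""
    return isinstance(value, Sequence) and not isinstance(value, (str, bytes))
```

Code that imports `Sequence` this way runs unchanged on both old and new interpreters, which matters for a library that still tolerated Python 2.7 at the time.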