Norvig spell corrector #133
Comments
As a side note / hint on spell checking: I just stumbled over the ropensci/hunspell package. I have not dug into the details of the implementation, but the basic idea is that it checks which affixes and word stems are allowed in a given language and checks a text against the entries in a dictionary (which can be taken, e.g., from LibreOffice). More details are in the package docs, e.g., in hunspell.R. So, if my understanding is correct, the hunspell approach is less probabilistic than Norvig's approach, which makes it easy to train on your own data. hunspell might still be useful, depending on the task to be solved, since existing dictionaries can be used directly. It might be worth comparing the quality of results between the two approaches (if anyone finds the time...).
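To make the contrast concrete, here is a minimal Python sketch, not the code of either package. It assumes a simplified Norvig-style corrector that only considers candidates one edit away and ranks them by frequency in user-supplied training text, next to a plain word-list lookup that stands in for a dictionary check. The lookup does not model hunspell's affix rules, and all names (`train`, `edits1`, `correct`, `in_dictionary`, `TRAINING_TEXT`) are made up for illustration.

```python
import re
from collections import Counter


def words(text):
    """Split text into lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(text):
    """Count word frequencies in the user's own training text."""
    return Counter(words(text))


def edits1(word):
    """Return all strings that are one edit away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts):
    """Frequency-based correction: pick the most frequent known candidate."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    # Fall back to the input when no known word is one edit away.
    return max(candidates, key=counts.get) if candidates else word


def in_dictionary(word, dictionary):
    """Dictionary-style check: a word is either listed or it is not.

    Assumption: a plain set lookup, with no affix or stem handling.
    """
    return word.lower() in dictionary


# Hypothetical training text; in practice this would be a large corpus.
TRAINING_TEXT = "the spelling corrector learns the spelling of words from the text"
COUNTS = train(TRAINING_TEXT)
DICTIONARY = set(COUNTS)

if __name__ == "__main__":
    print(correct("speling", COUNTS))
    print(in_dictionary("speling", DICTIONARY))
```

The frequency-based variant proposes a replacement and depends entirely on the training text it was given, while the lookup only reports whether a word is known, which is why a ready-made dictionary can be dropped in without any training step.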
Taken from #73: