Consider using part-of-speech matching #12

Open
waldoj opened this issue Oct 28, 2015 · 0 comments

waldoj commented Oct 28, 2015

It seems to me that we could get better results if we ensure that synonyms use the same part of speech as the term. For example, the top 10 most similar terms to "vehicle" are:

motor
motorcycle
self-propelled
powered
off-highway
semitrailer
vehicles
motorboat
off-road
late

But if we restrict this to nouns, we get:

motor
motorcycle
semitrailer
motorboat

That's a much better list of synonyms.

And for "improve":

enhance
achieve
restore
outcomes
low-income
replace
rehabilitate
growth
integrity
cost-effective

Whittling it down to verbs:

enhance
achieve
restore
replace
rehabilitate

Again, a much better list of synonyms.
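
As a rough illustration of the filtering step, here is a minimal Python sketch. It assumes the similar terms have already been computed and that a part-of-speech lookup exists. The `POS_LEXICON` dictionary and the `filter_by_pos` function are hypothetical names invented for this example, and the tags are hand-written rather than taken from a real dictionary. "vehicles" is deliberately absent from the toy lexicon, so it is dropped as an unknown word; the filtered list above also omits it, though the reason isn't stated.

```python
# Hypothetical, hand-written lexicon for illustration only. A real
# implementation would load tags from a machine-readable dictionary.
POS_LEXICON = {
    "motor": "noun",
    "motorcycle": "noun",
    "self-propelled": "adjective",
    "powered": "adjective",
    "off-highway": "adjective",
    "semitrailer": "noun",
    "motorboat": "noun",
    "off-road": "adjective",
    "late": "adjective",
}


def filter_by_pos(candidates, wanted_pos, lexicon=POS_LEXICON):
    """Keep candidates tagged with wanted_pos, preserving their order.

    Words missing from the lexicon are dropped.
    """
    return [word for word in candidates if lexicon.get(word) == wanted_pos]


# The ten most similar terms to "vehicle" listed above.
vehicle_candidates = [
    "motor", "motorcycle", "self-propelled", "powered", "off-highway",
    "semitrailer", "vehicles", "motorboat", "off-road", "late",
]

nouns = filter_by_pos(vehicle_candidates, "noun")
print(nouns)
```

Because the filter preserves input order, the similarity ranking of the surviving terms is unchanged.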

The stumbling block here is that a defined term doesn't always have a single, known part of speech. For example, "judge" could be a noun or a verb. But that's no reason not to use this approach. The one caveat, of course, is that it requires a machine-readable dictionary that we can use to determine the part of speech of each term.
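
One possible way to cope with ambiguous terms like "judge" is to store a set of tags per word and keep a candidate whenever its set overlaps the term's. The sketch below assumes that approach; it is not something the project already does. `POS_TAGS`, `shares_pos`, and the candidate words are all made up for illustration and are not real similarity output.

```python
# Hypothetical multi-tag lexicon: each word maps to every part of speech
# it can take. The entries are illustrative, not from a real dictionary.
POS_TAGS = {
    "judge": {"noun", "verb"},
    "justice": {"noun"},
    "adjudicate": {"verb"},
    "judicial": {"adjective"},
}


def shares_pos(term, candidate, tags=POS_TAGS, keep_unknown=False):
    """Return True if term and candidate have at least one tag in common.

    If either word is missing from the lexicon, fall back to keep_unknown.
    """
    term_tags = tags.get(term)
    candidate_tags = tags.get(candidate)
    if term_tags is None or candidate_tags is None:
        return keep_unknown
    return bool(term_tags & candidate_tags)


# Made-up candidate list; "magistrate" is intentionally not in the lexicon.
candidates = ["justice", "adjudicate", "judicial", "magistrate"]

strict = [c for c in candidates if shares_pos("judge", c)]
lenient = [c for c in candidates if shares_pos("judge", c, keep_unknown=True)]
print(strict)
print(lenient)
```

The `keep_unknown` flag makes the trade-off explicit: dropping unknown words gives cleaner lists, while keeping them avoids losing synonyms the dictionary simply doesn't cover.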
