better KDE bandwidth choice #378

Open
mrshirts opened this issue Mar 28, 2020 · 1 comment
@mrshirts
Collaborator

Pick and implement a reasonable algorithm for choosing the default kernel density estimate (KDE) bandwidth.

@mrshirts mrshirts self-assigned this Mar 28, 2020
@mrshirts
Collaborator Author

mrshirts commented Apr 5, 2020

See: https://en.wikipedia.org/wiki/Multivariate_kernel_density_estimation and https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection.

Silverman's rule is simple, but it is probably a bad idea here: PMFs are generally multimodal, and Silverman's rule assumes roughly Gaussian (unimodal) data, so it tends to oversmooth multimodal distributions.
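
To make the concern concrete, here is a minimal sketch of Silverman's rule of thumb for a 1D Gaussian kernel, applied to synthetic data. The function name `silverman_bandwidth` and the test distributions are made up for illustration and are not part of this project. With two well-separated narrow modes, the overall spread (not the width of each mode) sets the bandwidth, so each peak gets smeared out.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule of thumb: h = 0.9 * min(sigma, IQR / 1.34) * n**(-1/5)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sigma = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    spread = min(sigma, (q75 - q25) / 1.34)
    return 0.9 * spread * n ** (-0.2)

# Synthetic data (assumption): modes of width 0.3, as a stand-in for PMF basins.
rng = np.random.default_rng(0)
unimodal = rng.normal(0.0, 0.3, size=2000)
bimodal = np.concatenate([
    rng.normal(-3.0, 0.3, size=1000),
    rng.normal(3.0, 0.3, size=1000),
])

h_uni = silverman_bandwidth(unimodal)
h_bi = silverman_bandwidth(bimodal)
print(f"unimodal bandwidth: {h_uni:.3f}")
print(f"bimodal bandwidth:  {h_bi:.3f}")
```

In this setup the bimodal bandwidth comes out several times larger than the unimodal one even though each mode has the same width, which is the oversmoothing we want to avoid.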

Bandwidth selection seems to be rather difficult in the general case, and automatic selection is not yet supported in scikit-learn. Note that for multivariate distributions, the bandwidth is a matrix (analogous to a covariance matrix) rather than a scalar. We might consider using something like https://pythonhosted.org/PyQt-Fit/mod_kde.html or https://kdepy.readthedocs.io/en/latest/bandwidth.html, but they are not as standard. KDEpy seems a little simpler and supports some good bandwidth choices, but is not conda-installable.
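
For reference, a rough NumPy-only sketch of what a bandwidth matrix looks like in the multivariate case. It uses Scott's rule, H = n^(-2/(d+4)) * Sigma, only because it is short; it is a normal-reference rule like Silverman's and shares the same weakness for multimodal data, so it is not a proposed default. The helper names `scott_bandwidth_matrix` and `gaussian_kde_eval` and the sample data are hypothetical.

```python
import numpy as np

def scott_bandwidth_matrix(data):
    """Scott's rule: sample covariance scaled by n**(-2/(d+4)). data is (n, d)."""
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    return np.cov(data, rowvar=False) * n ** (-2.0 / (d + 4))

def gaussian_kde_eval(points, data, H):
    """Evaluate a Gaussian KDE with full bandwidth matrix H at points (m, d)."""
    points = np.atleast_2d(np.asarray(points, dtype=float))
    data = np.asarray(data, dtype=float)
    n, d = data.shape
    H_inv = np.linalg.inv(H)
    norm = 1.0 / (n * np.sqrt((2.0 * np.pi) ** d * np.linalg.det(H)))
    diff = points[:, None, :] - data[None, :, :]            # shape (m, n, d)
    maha = np.einsum("mnd,de,mne->mn", diff, H_inv, diff)   # squared Mahalanobis distances
    return norm * np.exp(-0.5 * maha).sum(axis=1)

# Correlated 2D samples (assumption: synthetic data for illustration only).
rng = np.random.default_rng(1)
cov_true = np.array([[1.0, 0.8], [0.8, 2.0]])
samples = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)

H = scott_bandwidth_matrix(samples)
density = gaussian_kde_eval([[0.0, 0.0], [1.0, 1.0]], samples, H)
print(H)
print(density)
```

The off-diagonal entries of `H` let the kernel follow correlations between coordinates, which a single scalar bandwidth cannot do.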
