
Bug in scikit-learn GaussianProcessRegressor #22

Open
dreamer2368 opened this issue Dec 21, 2024 · 1 comment
dreamer2368 commented Dec 21, 2024

GaussianProcessRegressor exhibits erroneous behavior that depends on the data: the optimized kernel length scale shrinks to its lower bound when the model is fitted to data with a sign change. The following is a minimal failing example.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

x = np.array([[0.0], [1.0]])
xpred = np.linspace(x[0][0], x[-1][0], 1000).reshape(-1, 1)

# Working dataset: no sign change in y.
y1 = np.array([-4, -1e-3]).reshape(-1, 1)

kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10)
gp = gp.fit(x, y1)

print(gp.kernel_.get_params()['k2__length_scale'])
yavg1, ystd1 = gp.predict(xpred, return_std=True)

plt.figure(1)
plt.plot(x, y1, 'ok')
plt.plot(xpred, yavg1, '-r')
plt.plot(xpred, yavg1 + 2 * ystd1, '--r')
plt.plot(xpred, yavg1 - 2 * ystd1, '--r')
plt.title('With (0, -4) and (1, -1e-3)')

# Failing dataset: nearly identical to the one above, but y changes sign.
y2 = np.array([-4, 1e-3]).reshape(-1, 1)

kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10)
gp = gp.fit(x, y2)

print(gp.kernel_.get_params()['k2__length_scale'])
yavg2, ystd2 = gp.predict(xpred, return_std=True)

plt.figure(2)
plt.plot(x, y2, 'ok')
plt.plot(xpred, yavg2, '-r')
plt.plot(xpred, yavg2 + 2 * ystd2, '--r')
plt.plot(xpred, yavg2 - 2 * ystd2, '--r')
plt.title('With (0, -4) and (1, +1e-3)')

The reason for this behavior is not known yet. Running the same example with other packages, such as GPy, does not reproduce it, so this appears to be a bug in the scikit-learn package.
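One way to check that the optimizer is genuinely driven to the boundary, rather than simply failing to converge, is to scan the log-marginal likelihood over a grid of length scales with the fitted amplitude held fixed. This is a diagnostic sketch, not part of the original report; the grid range and the choice to freeze the constant-kernel amplitude are arbitrary:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

x = np.array([[0.0], [1.0]])
y2 = np.array([[-4.0], [1e-3]])  # the failing dataset with a sign change

kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=20,
                              random_state=1, alpha=1e-10).fit(x, y2)

# theta is log-transformed: [log(constant_value), log(length_scale)].
# Freeze the fitted amplitude and vary only the length scale.
log_amp = gp.kernel_.theta[0]
scales = np.logspace(-5, 5, 21)
lml = [gp.log_marginal_likelihood(np.array([log_amp, np.log(s)]))
       for s in scales]
best = scales[int(np.argmax(lml))]
print(best)
```

If the likelihood surface really is maximized at the smallest length scale in the grid, the boundary fit is what the marginal likelihood asks for under this model, which narrows the question to why the likelihood behaves this way for sign-changing data.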

So far, the only so-called 'fix' is to raise the lower bound of the length scale to a 'reasonable' value. However, this does not address the root cause, and it forfeits the main reason for using a GP: tuning the length-scale hyperparameter on sound statistical principles. At this point, it may be better to replace scikit-learn with another GP package.
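For reference, the workaround described above looks like the following. The lower bound 1e-1 is an arbitrary 'reasonable' choice, not a principled one, and this sketch only guarantees the fitted length scale stays away from degenerate values by construction:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

x = np.array([[0.0], [1.0]])
y2 = np.array([[-4.0], [1e-3]])  # the failing dataset with a sign change

# Workaround: clamp the length scale away from tiny values.
# The bound 1e-1 is hand-picked; it masks the symptom, not the cause.
kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-1, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10).fit(x, y2)

ls = gp.kernel_.get_params()['k2__length_scale']
print(ls)  # stays >= 1e-1 because the optimizer respects the bounds
```

Another mitigation worth trying is `normalize_y=True`, which standardizes the targets before fitting; whether it changes the optimum for this particular dataset has not been verified here.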

@dreamer2368 dreamer2368 added the bug Something isn't working label Dec 21, 2024
@dreamer2368 dreamer2368 self-assigned this Dec 21, 2024
CBonneville45 commented Dec 22, 2024 via email
