Bug in scikit-learn GaussianProcessRegressor #22
Labels: bug (Something isn't working)
This is actually not a bug, and it isn't due to the sign of the data (in general). It is an unfortunate but quite common instability inherent to Gaussian processes. In GP inference and training, you need at some point to invert a kernel matrix (this is all done under the hood in sklearn and most common GP packages), but this matrix can be poorly conditioned depending on your data. To make the GP stable, a common workaround is to add jitter to the kernel matrix diagonal. "Physically," this jitter is equivalent to the inherent noise in the data (so for virtually noiseless data you should be able to get away with adding only a tiny jitter). In sklearn you can tune the jitter manually with the alpha parameter. With a larger alpha value you shouldn't observe the instability anymore (but with a value that is too large, the GP will essentially assume the data is all noise and underfit).
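A minimal sketch of this workaround, reusing the two-point, sign-change example from the report quoted below (the jitter/alpha value of 1e-6 is an illustrative choice, not one prescribed here):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

x = np.array([[0.0], [1.0]])
y = np.array([[-4.0], [1e-3]])  # the sign-change case from the report below

# The conditioning problem itself: as the length scale grows, the 2x2
# RBF kernel matrix approaches all-ones and its condition number grows
# without bound, while a small diagonal jitter caps it (at roughly
# 2/jitter for this 2x2 case).
for ls in [1.0, 100.0, 10000.0]:
    K = RBF(length_scale=ls)(x)
    print(ls, np.linalg.cond(K), np.linalg.cond(K + 1e-6 * np.eye(2)))

# The workaround: alpha adds the same jitter to the kernel diagonal
# during fitting, trading a little assumed noise for stability.
kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-6)
gp.fit(x, y)
print(gp.kernel_.get_params()['k2__length_scale'])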
________________________________
From: dreamer2368
Sent: Saturday, December 21, 2024 7:15 AM
To: LLNL/GPLaSDI
Subject: [LLNL/GPLaSDI] Bug in scikit-learn GaussianProcessRegressor (Issue #22)
GaussianProcessRegressor exhibits erroneous behavior depending on the data. Specifically, the optimal length scale of the kernel shrinks to its minimum bound when fitted to data that changes sign. The following is a minimal failing example.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

# Two training points at x = 0 and x = 1, with a dense prediction grid.
x = np.array([[0.0], [1.0]])
xpred = np.linspace(x[0][0], x[-1][0], 1000).reshape(-1, 1)

# Case 1: both targets negative (no sign change). The fitted length
# scale comes out reasonable.
y1 = np.array([-4, -1e-3]).reshape(-1, 1)
kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10)
gp = gp.fit(x, y1)
print(gp.kernel_.get_params()['k2__length_scale'])

yavg1, ystd1 = gp.predict(xpred, return_std=True)
plt.figure(1)
plt.plot(x, y1, 'ok')
plt.plot(xpred, yavg1, '-r')
plt.plot(xpred, yavg1 + 2 * ystd1, '--r')
plt.plot(xpred, yavg1 - 2 * ystd1, '--r')
plt.title('With (0, -4) and (1, -1e-3)')

# Case 2: the targets change sign. The fitted length scale collapses
# to the lower bound.
y2 = np.array([-4, 1e-3]).reshape(-1, 1)
kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-5, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10)
gp = gp.fit(x, y2)
print(gp.kernel_.get_params()['k2__length_scale'])

yavg2, ystd2 = gp.predict(xpred, return_std=True)
plt.figure(2)
plt.plot(x, y2, 'ok')
plt.plot(xpred, yavg2, '-r')
plt.plot(xpred, yavg2 + 2 * ystd2, '--r')
plt.plot(xpred, yavg2 - 2 * ystd2, '--r')
plt.title('With (0, -4) and (1, +1e-3)')
plt.show()
The reason for this behavior is not known yet. Running the same example with other packages, such as GPy, does not show this behavior, so this appears to be a bug in the scikit-learn package.
So far, the only so-called 'fix' is to set the lower bound of the length scale to a 'reasonable' value. However, this does not fix the root cause, and it forfeits the reason for using a GP in the first place: tuning the length-scale hyperparameter on statistical principles. At this point, scikit-learn is better replaced with another GP package.
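For reference, a minimal sketch of that lower-bound workaround; the bound of 1e-1 is an illustrative choice, not a value given in this report:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

# Clamp the length scale away from its degenerate minimum. What counts
# as a 'reasonable' lower bound depends on the data, which is exactly
# the objection raised above.
kernel = ConstantKernel() * RBF(length_scale_bounds=(1e-1, 1e5))
gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=100,
                              random_state=1, alpha=1e-10)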