
NP Regression Model w/ LIG Acquisition #2683

Open · wants to merge 12 commits into main

Conversation

@eibarolle

Motivation

This pull request adds a Neural Process Regression model with a Latent Information Gain acquisition function to BoTorch.

Have you read the Contributing Guidelines on pull requests?

Yes, and I've followed all the steps, including testing.

Test Plan

I wrote unit tests for both the model and the acquisition function; the test files are in the appropriate folder, and all of them pass under pytest.

Related

I made a repository holding the pushed files at https://github.com/eibarolle/np_regression, and it has the appropriate API documentation.

@facebook-github-bot added the CLA Signed label on Jan 18, 2025.
@sdaulton (Contributor)

Hi @eibarolle! Thanks for the PR! I'll review it shortly.

@sdaulton (Contributor) left a review


Thanks for the PR! This is looking good. I left some comments inline.

import torch
import torch.nn as nn
import matplotlib.pyplot as plts
# %matplotlib inline

Suggested change (remove):
# %matplotlib inline

Comment on lines 24 to 27
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (RBF, Matern, RationalQuadratic,
ExpSineSquared, DotProduct,
ConstantKernel)

Suggested change (remove):
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (RBF, Matern, RationalQuadratic,
                                              ExpSineSquared, DotProduct,
                                              ConstantKernel)

Let's avoid the sklearn dependency since it isn't used

ConstantKernel)
from typing import Callable, List, Optional, Tuple
from torch.nn import Module, ModuleDict, ModuleList
from sklearn import preprocessing

Suggested change (remove):
from sklearn import preprocessing

from typing import Callable, List, Optional, Tuple
from torch.nn import Module, ModuleDict, ModuleList
from sklearn import preprocessing
from scipy.stats import multivariate_normal

Suggested change (remove):
from scipy.stats import multivariate_normal

Let's remove the unused imports

from scipy.stats import multivariate_normal
from gpytorch.distributions import MultivariateNormal

device = torch.device("cpu")

Suggested change (remove):
device = torch.device("cpu")

Let's make the code agnostic to the device used.
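For instance (just an illustrative sketch, not this PR's actual code), the device can be inferred from the inputs and module parameters rather than fixed at import time:

import torch
import torch.nn as nn

class ToyEncoder(nn.Module):  # hypothetical module, for illustration only
    def __init__(self, x_dim: int = 2, r_dim: int = 8) -> None:
        super().__init__()
        self.net = nn.Linear(x_dim, r_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # New tensors are created from existing ones, so they inherit the
        # device/dtype of `x` rather than a hard-coded torch.device("cpu").
        noise = torch.randn_like(x)
        return self.net(x + noise)

encoder = ToyEncoder()
out = encoder(torch.rand(4, 2))  # works unchanged on CPU or, after .to("cuda"), on GPU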

Comment on lines 294 to 297
if n == 1:
eps = torch.autograd.Variable(logvar.data.new(self.z_dim).normal_()).to(device)
else:
eps = torch.autograd.Variable(logvar.data.new(n,self.z_dim).normal_()).to(device)

Suggested change (replace):
    if n == 1:
        eps = torch.autograd.Variable(logvar.data.new(self.z_dim).normal_()).to(device)
    else:
        eps = torch.autograd.Variable(logvar.data.new(n, self.z_dim).normal_()).to(device)
with:
    shape = [n, self.z_dim]
    if n == 1:
        shape = shape[1:]
    eps = torch.autograd.Variable(logvar.data.new(*shape).normal_()).to(device)

This is a bit more concise
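As an aside (a sketch only; the surrounding code in the PR may differ), torch.autograd.Variable has been a no-op wrapper since PyTorch 0.4, so the noise can be drawn directly with torch.randn on the right device:

import torch

def sample_eps(logvar: torch.Tensor, z_dim: int, n: int = 1) -> torch.Tensor:
    # Standard-normal noise on the same device/dtype as `logvar`,
    # with the leading sample dimension dropped when n == 1.
    shape = (z_dim,) if n == 1 else (n, z_dim)
    return torch.randn(shape, device=logvar.device, dtype=logvar.dtype)

eps = sample_eps(torch.zeros(3), z_dim=3, n=5)  # shape (5, 3)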

mu: torch.Tensor,
logvar: torch.Tensor,
n: int = 1,
min_std: float = 0.1,

This default seems high, no?

Comment on lines 356 to 368
def load_state_dict(
self,
state_dict: dict,
strict: bool = True
) -> None:
"""
Initialize the fully Bayesian model before loading the state dict.

Args:
state_dict (dict): A dictionary containing the parameters.
strict (bool): Case matching strictness.
"""
super().load_state_dict(state_dict, strict=strict)

Not needed, since it just calls the parent class's method.

Comment on lines 450 to 455
ind = np.arange(x.shape[0])
mask = np.random.choice(ind, size=n_context, replace=False)
x_c = torch.from_numpy(x[mask])
y_c = torch.from_numpy(y[mask])
x_t = torch.from_numpy(np.delete(x, mask, axis=0))
y_t = torch.from_numpy(np.delete(y, mask, axis=0))

Any reason to not do this in pytorch?
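For example (an illustrative sketch, assuming x and y are already tensors), the same random context/target split can be done with torch.randperm:

import torch

def context_target_split(x: torch.Tensor, y: torch.Tensor, n_context: int):
    # Random permutation on the same device as the data; no numpy round trip.
    perm = torch.randperm(x.shape[0], device=x.device)
    ctx, tgt = perm[:n_context], perm[n_context:]
    return x[ctx], y[ctx], x[tgt], y[tgt]

x, y = torch.rand(10, 2), torch.rand(10, 1)
x_c, y_c, x_t, y_t = context_target_split(x, y, n_context=4)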

import torch
#reference: https://arxiv.org/abs/2106.02770

class LatentInformationGain:

Can we implement this as a subclass of AcquisitionFunction, so that we can use it more organically in BoTorch? Likely the context would need to be provided in LatentInformationGain.__init__.
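Roughly along these lines (a skeleton only; the information-gain computation itself is elided, and the attribute names are illustrative):

import torch
from torch import Tensor
from botorch.acquisition import AcquisitionFunction
from botorch.models.model import Model

class LatentInformationGain(AcquisitionFunction):
    def __init__(self, model: Model, context_x: Tensor, context_y: Tensor, num_samples: int = 10) -> None:
        # Context data is fixed at construction time; only candidates go through forward().
        super().__init__(model=model)
        self.register_buffer("context_x", context_x)
        self.register_buffer("context_y", context_y)
        self.num_samples = num_samples

    def forward(self, X: Tensor) -> Tensor:
        # X: candidate points; compute the expected KL between the latent
        # posterior (context + candidates) and the latent prior (context only).
        raise NotImplementedError  # elided in this sketch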

@eibarolle (Author)

I applied the suggested changes to my new code. Note that I kept the decoder as is, since the other functions are designed around its current state.

@Balandat (Contributor)

@hvarfner curious if you have any thoughts on this PR re the information gain aspects

@hvarfner (Contributor)

Interesting! I'll check out the paper quickly and get back to you


self.acquisition_function.num_samples = 20
lig_2 = self.acquisition_function.forward(candidate_x=self.candidate_x)
self.assertTrue(lig_2.item() < lig_1.item())

Why would this unit test necessarily be a good check? I am not sure on the details, but it seems to me that this just improves the accuracy of the acquisition computation.

self.context_x = context_x.to(device)
self.context_y = context_y.to(device)

def forward(self, candidate_x):

It seems to me like the acquisition function computes the information gain on one batch of points, with the batch being in the first dimension. Thus, the output of this forward would be one scalar.

This would run contrary to the acquisition function convention, and so it wouldn't be able to be used with optimize_acqf etc.

@hvarfner (Contributor) commented on Jan 28, 2025

You would want the forward to be able to handle an N x q x D-shaped input, where you are currently only computing the q-element (but that aspect seems correct as far as I can tell right now). This may be a bit challenging, but it certainly looks doable!

I think this is a pretty cool idea that could be useful generally in latent space models, so it would be nice to have some fairly general naming for the encoding steps in case this were to be used with other encoder-decoder based architectures.
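Purely as a shape sketch (the actual gain computation is omitted), forward would map an N x q x D batch of candidates to N values, reducing over q:

import torch
from torch import Tensor

def lig_forward(X: Tensor) -> Tensor:
    # X: N x q x D. Placeholder per-point quantities of shape N x q ...
    per_point = X.new_zeros(X.shape[0], X.shape[1])
    # ... reduced over q, giving the shape-N output that optimize_acqf expects.
    return per_point.sum(dim=-1)

assert lig_forward(torch.rand(5, 3, 2)).shape == torch.Size([5])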

@eibarolle (Author)

The Latent Information Gain forward function has been updated with the correct dimensions, with the test cases adjusted as needed.

@eibarolle (Author) commented on Feb 10, 2025

Please let me know when you can check over the new Latent Information Gain function. @hvarfner

@eibarolle (Author)

Any updates @hvarfner?

@hvarfner (Contributor)

Hi @eibarolle,

Thanks for pinging, and sorry for being slow in this. I've taken a good look at it now.

Some of the issues pointed out by @sdaulton (MAE etc.) are still unaddressed, so it would be nice to have those taken care of. Moreover, the code needs to be formatted (see CONTRIBUTING.md).

On my previous point, the acquisition function should return a tensor of shape N, unless the model is an ensemble. Thus, the q-dimension should be reduced over. In this case, it should probably amount to a sum(dim=-1), but that is just my educated guess.

Moreover, it is not obvious how to run the code, since it deviates substantially from other BoTorch models (e.g. SingleTaskGP). Thus, I think this PR would have to come with an example (e.g. a tutorial) of the intended usage. Moreover, the code should adhere as closely as possible to the standard BoTorch interfaces that are in place, ideally so that one can swap an STGP for an NPR with mostly the same arguments otherwise. As far as I can tell, this should be doable. The same goes for the training of the model, which you aptly pointed out in a previous commit here! Specifically, here are the obvious deviations from the convention that should be addressed:

  • The model should take train_X and (optionally) train_Y, but not much more than that
  • The acquisition function should take the model, but not the training data
  • The model should be trained with a BoTorch optimizer (probably fit_gpytorch_mll_torch in your case)

It seems like you have done it differently: the model takes NN parameters (these should, if anything, be kwargs with sensible defaults, like here), and, if I am not mistaken, the acquisition function takes the train_X and train_Y. The idea is to have NPR work with other acquisition functions, and to have LIG work with other models, as long as they have a latent space with a similar method to call for the latent space predictions.
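For concreteness, here is roughly the interaction pattern a tutorial should demonstrate, shown with standard components (SingleTaskGP and ExpectedImprovement); the goal is that NPR and LIG can be dropped into the same slots:

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import ExpectedImprovement
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

train_X = torch.rand(20, 2, dtype=torch.float64)
train_Y = (train_X ** 2).sum(dim=-1, keepdim=True)

# The model takes train_X and train_Y (plus optional kwargs with sensible defaults).
model = SingleTaskGP(train_X=train_X, train_Y=train_Y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)  # an NP model would instead use a torch optimizer, e.g. fit_gpytorch_mll_torch

# The acquisition function takes the model, not the training data.
acqf = ExpectedImprovement(model=model, best_f=train_Y.max())
candidate, value = optimize_acqf(
    acqf,
    bounds=torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.float64),
    q=1,
    num_restarts=4,
    raw_samples=32,
)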

I recognize that this is a lot of work, but adhering to this standard ultimately ensures the usefulness of your code.

Now, I am not quite sure what the bar is for community contributions, so it would be good to get e.g. @sdaulton 's or @Balandat 's take on this as well.

@eibarolle
Copy link
Author

Got it, thanks for the advice and pointers. I'll make the needed additions and examples, and I'll reach out with any questions.
