Black-box spike and slab variational inference, example with linear models

ldv1/bbvi_spike_and_slab

Motivation

In "Black-Box Stochastic Variational Inference in Five Lines of Python", Duvenaud showed how to use the Python module autograd to implement, in a few lines of code, the black-box variational inference introduced in "Black Box Variational Inference" by Ranganath et al.

I adapted his code to linear models with a spike and slab prior.

Dependencies

You will need Python 3 with autograd and matplotlib.

Model

A succinct description of the model can be found here.

In short: We use black-box variational inference to find an approximation to the posterior over all parameters using optimization. The spike and slab prior introduces continuous and discrete random variables. To sample from the approximate posterior, we use the Gumbel-Max trick for the discrete variables and the reparameterization trick for the continuous variables.
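The two sampling steps above can be sketched as follows. This is a minimal illustration in plain NumPy, not the repository's code: the function names, the `(M, 2)` layout of the logits, and the parameterization by `mean` and `log_std` are all assumptions made for the example.

```python
import numpy as np

def sample_gaussian(mean, log_std, rng):
    """Reparameterization trick: z = mean + std * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mean.shape)
    return mean + np.exp(log_std) * eps

def sample_indicator_gumbel_max(logits, rng):
    """Gumbel-Max trick for binary inclusion indicators.

    logits has shape (M, 2) (an assumed layout): unnormalized
    log-probabilities of "excluded" (column 0) and "included" (column 1).
    Adding independent Gumbel(0, 1) noise and taking the argmax yields an
    exact sample from the corresponding categorical distribution.
    """
    gumbel = rng.gumbel(size=logits.shape)
    return np.argmax(logits + gumbel, axis=1)

def sample_slopes(mean, log_std, logits, rng):
    """One draw of the slopes: indicator (spike or slab) times slab value."""
    slab = sample_gaussian(mean, log_std, rng)
    gamma = sample_indicator_gumbel_max(logits, rng)
    return gamma * slab, gamma
```

A slope whose indicator is 0 is exactly 0 (the spike); otherwise it takes the Gaussian slab value. Note that the hard argmax used here is not differentiable; how gradients are obtained for the discrete part is not covered by this sketch.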

Results

To generate the dataset, we follow Bettencourt in "Bayes Sparse Regression". The covariates are all independently distributed around zero with unit variance, and there is a population of both large, relevant slopes and small, irrelevant slopes. Moreover, the data are collinear, with more covariates than observations, which implies a non-identified likelihood. In detail: we have M=200 covariates and N=100 noisy observations (the Gaussian noise has variance 1). Each slope is large with probability 0.05, i.e. its indicator is drawn from Bernoulli(0.05).
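A dataset with this structure could be simulated roughly as below. The values of M, N, the noise variance, and the Bernoulli(0.05) probability come from the description above; the slope magnitudes (large slopes around ±10, small slopes with standard deviation 0.25), the absence of an intercept, and the name `make_dataset` are illustrative assumptions and may differ from what the repository actually uses.

```python
import numpy as np

def make_dataset(M=200, N=100, p_large=0.05, noise_std=1.0, seed=0):
    """Simulate a sparse linear regression problem with M > N."""
    rng = np.random.default_rng(seed)

    # Independent covariates centered at zero with unit variance.
    X = rng.standard_normal((N, M))

    # Each slope is "large" with probability p_large.
    is_large = rng.random(M) < p_large

    # Assumed magnitudes: large slopes near +/-10, small slopes near 0.
    sign = rng.choice([-1.0, 1.0], size=M)
    large = sign * rng.normal(10.0, 1.0, size=M)
    small = rng.normal(0.0, 0.25, size=M)
    beta = np.where(is_large, large, small)

    # Noisy observations with Gaussian noise of standard deviation noise_std.
    y = X @ beta + noise_std * rng.standard_normal(N)
    return X, y, beta, is_large

X, y, beta, is_large = make_dataset()
```

Because N < M, the design matrix `X` has rank at most N, so many slope vectors reproduce the noiseless observations equally well. This is the non-identifiability that the sparsity-inducing prior has to resolve.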

Bettencourt showed that the Finnish horseshoe prior does a pretty good job. However, the model is involved, and tuning its parameters demands a certain level of expertise.

The spike and slab prior finds the relevant slopes and sets the irrelevant slopes to 0, as depicted in the following figure:

[Figure: Demo in 2D]

In the large N regime (in the figure below N=4000), the fit is excellent:

[Figure: Demo in 2D]

Authors

Laurent de Vito

License

All third-party libraries are subject to their own license.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.