Skip to content

Commit

Permalink
update
Browse files Browse the repository at this point in the history
  • Loading branch information
abdullahau committed Mar 9, 2025
1 parent f38575e commit e4c213d
Show file tree
Hide file tree
Showing 10 changed files with 356 additions and 229 deletions.
2 changes: 1 addition & 1 deletion 01 - The Garden of Forking Data.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -1597,7 +1597,7 @@ The terms **PMF**, **PDF, CDF**, and **PPF** are fundamental concepts in probabi
**Key Differences**
| **Term** | **Full Name** | **Applies To** | **Description** |
|------------------|------------------|------------------|------------------|
|----|----|----|----|
| PMF | Probability Mass Function | Discrete random variables | Probability that a discrete random variable takes on a specific value. |
| CDF | Cumulative Distribution Function | Discrete and continuous random variables | Probability that a random variable is less than or equal to a specific value. |
| PDF | Probability Density Function | Continuous random variables | Relative likelihood of a continuous random variable taking on a specific value. |
Expand Down
2 changes: 1 addition & 1 deletion 02 - Sampling The Imaginary.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -1498,7 +1498,7 @@ model = utils.StanQuap('stan_models/m2', m2, data)
summary = model.precis().round(7)
summary
samples = model.draws()['p']
samples = model.extract_samples()['p']
# plot the posterior
def plot_pos():
Expand Down
4 changes: 2 additions & 2 deletions 03a - Geocentric Models.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -888,7 +888,7 @@ The two-element vector in the output is the list of variances. If you take the s
Okay, so how do we get samples from this multi-dimensional posterior? Now instead of sampling single values from a simple Gaussian distribution, we sample vectors of values from a multi-dimensional Gaussian distribution. The `StanQuap` object provides a convenience function to do exactly that:

```{python}
post = m4_1_model.draws()
post = m4_1_model.extract_samples()
post = pd.DataFrame(post)
post.head()
```
Expand Down Expand Up @@ -916,7 +916,7 @@ These samples also preserve the covariance between $µ$ and $σ$. This hardly ma

#### Under the hood with multivariate sampling

The `draw` and `laplace_sample` methods are for convenience. Once called, these Stan methods run simple simulation of the sort we conducted earlier. Here’s a peak at the motor. The work is done by a multi-dimensional version of `stats.norm.rvs`, `stats.multivariate_normal.rvs`. The function `stats.norm.rvs` simulates random Gaussian values, while `stats.multivariate_normal.rvs` simulates random vectors of multivariate Gaussian values. Here’s how to use it to do what `laplace_sample` does:
The `extract_samples` and `laplace_sample` methods are for convenience. Once called, these Stan methods run simple simulation of the sort we conducted earlier. Here’s a peak at the motor. The work is done by a multi-dimensional version of `stats.norm.rvs`, `stats.multivariate_normal.rvs`. The function `stats.norm.rvs` simulates random Gaussian values, while `stats.multivariate_normal.rvs` simulates random vectors of multivariate Gaussian values. Here’s how to use it to do what `laplace_sample` does:

```{python}
stats.multivariate_normal.rvs(
Expand Down
457 changes: 261 additions & 196 deletions 03b - Geocentric Models.qmd

Large diffs are not rendered by default.

69 changes: 46 additions & 23 deletions R_code.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,20 @@ class BridgeStan(bs.StanModel):
super().__init__(stan_so, data, make_args=make_args, warn=False)
class StanQuap(object):
'''
Description:
Find mode of posterior distribution for arbitrary fixed effect models and
then produce an approximation of the full posterior using the quadratic
curvature at the mode.
This command provides a convenient interface for finding quadratic approximations
of posterior distributions for models defined in Stan. This procedure is equivalent
to penalized maximum likelihood estimation and the use of a Hessian for profiling,
and therefore can be used to define many common regularization procedures. The point
estimates returned correspond to a maximum a posterior, or MAP, estimate. However the
intention is that users will use `draws` and `laplace_sample` and other methods to work
with the full posterior.
'''
def __init__(self,
stan_file: str,
stan_code: str,
Expand Down Expand Up @@ -101,8 +115,8 @@ class StanQuap(object):
draws=draws,
jacobian=self.jacobian)
def draws(self, draws: int = 100_000, dict_out: bool = True):
laplace_obj = self.laplace_sample(draws)
def extract_samples(self, n: int = 100_000, dict_out: bool = True):
laplace_obj = self.laplace_sample(n)
if dict_out:
return laplace_obj.stan_variables()
return laplace_obj.draws()
Expand Down Expand Up @@ -179,8 +193,8 @@ class StanQuap(object):
'Parameter': list(self.params.keys()),
'Mean': pos_mu,
'StDev': pos_sigma,
f'{plo:.2%}': lo,
f'{phi:.2%}': hi})
f'{plo:.1%}': lo,
f'{phi:.1%}': hi})
return res.set_index('Parameter')
```

Expand Down Expand Up @@ -422,32 +436,41 @@ library(rethinking)
data(Howell1)
d <- Howell1
d2 <- d[ d$age >= 18 , ]
plot( d2$height ~ d2$weight )
```


```{r}
# load data again, since it's a long way back
library(rethinking)
data(Howell1); d <- Howell1; d2 <- d[ d$age >= 18 , ]
# define the average weight, x-bar
xbar <- mean(d2$weight)
# fit model
m4.3 <- quap(
alist(
height ~ dnorm( mu , sigma ) ,
mu <- a + b*( weight - xbar ) ,
a ~ dnorm( 178 , 20 ) ,
b ~ dlnorm( 0 , 1 ) ,
sigma ~ dunif( 0 , 50 )
) , data=d2 )
```

```{r}
d3 <- sample( d2$height , size=20 )
mu.list <- seq( from=150, to=170 , length.out=200 )
sigma.list <- seq( from=4 , to=20 , length.out=200 )
post2 <- expand.grid( mu=mu.list , sigma=sigma.list )
post2$LL <- sapply( 1:nrow(post2) , function(i)
sum( dnorm( d3 , mean=post2$mu[i] , sd=post2$sigma[i] ,
log=TRUE ) ) )
post2$prod <- post2$LL + dnorm( post2$mu , 178 , 20 , TRUE ) +
dunif( post2$sigma , 0 , 50 , TRUE )
post2$prob <- exp( post2$prod - max(post2$prod) )
sample2.rows <- sample( 1:nrow(post2) , size=1e4 , replace=TRUE ,
prob=post2$prob )
sample2.mu <- post2$mu[ sample2.rows ]
sample2.sigma <- post2$sigma[ sample2.rows ]
plot( sample2.mu , sample2.sigma , cex=0.5 ,
col=col.alpha(rangi2,0.1) ,
xlab="mu" , ylab="sigma" , pch=16 )
precis( m4.3 )
pairs(m4.3)
```

```{r}
dens( sample2.sigma , norm.comp=TRUE )
mu <- link( m4.3 )
?link
```


```{r}
m4.3@links
```


1 change: 1 addition & 0 deletions stan_models/m4_3.stan
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ parameters {
}
model {
vector[N] mu;
// linear model
mu = a + b * (weight - xbar);

// Likelihood Function
Expand Down
Binary file added stan_models/m4_3b
Binary file not shown.
24 changes: 24 additions & 0 deletions stan_models/m4_3b.stan
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@

data {
int<lower=0> N;
vector[N] height;
vector[N] weight;
real xbar;
}
parameters {
real a;
real log_b;
real<lower=0, upper=50> sigma;
}
model {
vector[N] mu;
mu = a + exp(log_b) * (weight - xbar);

// Likelihood Function
height ~ normal(mu, sigma);

// Priors
a ~ normal(178, 20);
log_b ~ normal(0, 1);
sigma ~ uniform(0, 50);
}
Binary file added stan_models/m4_3b_model.so
Binary file not shown.
26 changes: 20 additions & 6 deletions utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,20 @@ def __init__(self, stan_file: str, data: dict, force_compile=False):
super().__init__(stan_so, data, make_args=make_args, warn=False)

class StanQuap(object):
'''
Description:
Find mode of posterior distribution for arbitrary fixed effect models and
then produce an approximation of the full posterior using the quadratic
curvature at the mode.
This command provides a convenient interface for finding quadratic approximations
of posterior distributions for models defined in Stan. This procedure is equivalent
to penalized maximum likelihood estimation and the use of a Hessian for profiling,
and therefore can be used to define many common regularization procedures. The point
estimates returned correspond to a maximum a posterior, or MAP, estimate. However the
intention is that users will use `draws` and `laplace_sample` and other methods to work
with the full posterior.
'''
def __init__(self,
stan_file: str,
stan_code: str,
Expand Down Expand Up @@ -98,8 +112,8 @@ def laplace_sample(self, draws: int = 100_000):
draws=draws,
jacobian=self.jacobian)

def draws(self, draws: int = 100_000, dict_out: bool = True):
laplace_obj = self.laplace_sample(draws)
def extract_samples(self, n: int = 100_000, dict_out: bool = True):
laplace_obj = self.laplace_sample(draws=n)
if dict_out:
return laplace_obj.stan_variables()
return laplace_obj.draws()
Expand Down Expand Up @@ -203,16 +217,16 @@ def invlogit(x: float) -> float:


def precis(samples, prob=0.89):
if isinstance(samples, dict):
if isinstance(samples, dict) or isinstance(samples, pd.Series):
samples = pd.DataFrame(samples)
plo = (1-prob)/2
phi = 1 - plo
res = pd.DataFrame({
'Parameter':samples.columns.to_numpy(),
'Mean': samples.mean().to_numpy(),
'StDev': samples.std().to_numpy(),
f'{plo:.1%}': samples.quantile(q=plo).to_numpy(),
f'{phi:.1%}': samples.quantile(q=phi).to_numpy()
'StDev': samples.std().to_numpy(),
f'{plo:.1%}': samples.quantile(q=plo).to_numpy(),
f'{phi:.1%}': samples.quantile(q=phi).to_numpy()
})
return res.set_index('Parameter')

Expand Down

0 comments on commit e4c213d

Please sign in to comment.