BOOST-HMI

BOOST-HMI is a generalized energy-based Bayesian framework method that is proposed to identify spatial variable (SV) genes from the imaging-based spatially resolved transcriptomic (SRT) dataset. A hidden gene expression level indicator is introduced to dichotomize the gene expression counts into two levels: highly-expressed and low-expressed. It models the gene expression counts using a zero-inflated negative binomial mixture distribution and the interaction between highly-expressed and lowly-expressed cells is characterized by a hidden Bayesian mark interaction model.

The following R packages are required to run the model

Rcpp
RcppArmadillo

Usage

The main steps of BOOST-HMI are the following:

Preparation of gene expression count file
Preparation of spatial location file
BOOST-HMI analysis
Downstream analysis

Example data

One replicate of the Mouse hippocampus seqFISH data cohort is Data/hippocampus_field_43.Rdata
The gene expression count file has the following format

	Tal1	Dmbx1	Emx2	...	fbll1
Cell 1	3	11	13	...	28
Cell 2	3	9	14	...	20
Cell 3	12	3	1	...	25
...	...	...	...	...	...
Cell 257	1	0	0	...	6

The geostatistical profile has the following format

	x	y
Cell 1	686	352
Cell 2	488	572
Cell 3	178	624
...	...	...
Cell 257	461	234

BOOST-HMI analysis

source("code/functions.R")
Rcpp::sourceCpp("code/zinbm.cpp")
load("data/hippocampus_field_43.Rdata")

# prior specification
Q <- 2
c <- 0.15
iter <- 50000
burn <- iter/2
M <- 1
# size factor
rowsum = apply(count,1,sum)
scaler = exp(-1/nrow(count) * sum(log(rowsum)))
si = scaler * rowsum
# priors for phi
a_phi = 0.01
b_phi= 100
# priors for mu
a_mu = 0.01; 
b_mu= 100;
# prior for theta
mu_theta <- 0;
sigma_theta <- 3.5; # 95.44% theta in (-4,4)
# prior for omega
mu_omega <- 1;
sigma_omega <- 0.5;

# Prior in proposal distribution
tau_mu <- 0.1 
tau_theta<- 0.1
tau_omega = 0.1
tau_phi = 0.1   
# parameter configuration
theta_start <- rnorm(1, mu_theta, sigma_theta); 
theta_start <- matrix(c(0,theta_start,theta_start,0), ncol =2)
omega_start <- rep(1,Q)
H = rep(0,nrow(count))
phi_start <- c(1,1)
mu_start <- c(0,0)

# build edges and distance
id_start <- 1;
id_end <- nrow(count);
build <- dist_list(x, y, c);
edge <- build$edge;
distance <- build$distance;
duplicate <- build$duplicate;
flag_start <- build$flag_start;
flag_end <- build$flag_end;

# Implement MCMC algorithm
re <- model_estimator(H,z_start,count[,g], edge, distance, duplicate, id_start, id_end, flag_start, flag_end, theta_start, omega_start, 
lambda_start, mu_theta, sigma_theta, mu_omega, sigma_omega, a_lambda, b_lambda, iter, burn, M, 
tau_mu, phi_lambda ,mu_start, phi_start, si,tau_lambda,tau_theta,tau_omega,tau_phi,a_phi,b_phi,a_mu,b_mu);

Output

H_sum
mu
phi
z_summed_by_row
z_colmeans
omega
theta
accept_theta
accept_omega
accept_mu
accept_phi

# calculate Bayes factor
source("code/utilities.R")
bayes_factor <- bf(re)

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
code		code
data		data
README.md		README.md
seqfish_mog.png		seqfish_mog.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

BOOST-HMI

Usage

Example data

BOOST-HMI analysis

Output

SV gene visualization

About

Releases

Packages

Languages

libedeutch/BOOST-HMI

Folders and files

Latest commit

History

Repository files navigation

BOOST-HMI

Usage

Example data

BOOST-HMI analysis

Output

SV gene visualization

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages