Skip to content

Predicting Absolute and Relative Abundance by Modeling Efficiency to Derive Intervals and Concentrations

License

Unknown, BSD-3-Clause licenses found

Licenses found

Unknown
LICENSE
BSD-3-Clause
LICENSE.md
Notifications You must be signed in to change notification settings

statdivlab/paramedic

Repository files navigation

R/paramedic: Predicting Absolute and Relative Abundance by Modeling Efficiency to Derive Intervals and Concentrations

Build Status codecov License: BSD-3

Software author: Brian Williamson

Methodology authors: Brian Williamson, Jim Hughes and Amy Willis


Introduction

paramedic is a R package for estimating microbial concentration. paramedic uses information from 16S count data (compositional data on all taxa) and absolute data on a subset of taxa (e.g., qPCR or flow cytometry) to estimate the absolute abundance of all taxa. The method accounts for differing taxon detection efficiencies between the two methods, and produces prediction and confidence intervals as well as point estimates of the absolute abundance of all taxa. Check out the paper (also available here) for more details.


R installation

You may install a stable release of paramedic from GitHub via devtools by running the following code (replace v0.1.0 with the tag for the specific release you wish to install, and only install devtools if needed):

install.packages("devtools")
devtools::install_github(repo = "statdivlab/[email protected]")

You may install a development release of paramedic from GitHub via devtools by running the following code:

install.packages("devtools")
devtools::install_github(repo = "statdivlab/paramedic")

Example

This example shows how to use paramedic in a simple setting with data inspired by the University of Washington (UW) Sexually Transmitted Infections Cooperative Research Center (STICRC). For more examples and detailed explanation, please see the vignette.

## load required functions and libraries
library("rstan")
library("paramedic")

## read in the data
data(example_16S_data)
data(example_qPCR_data)

# sample hyperparameter values
sigma_beta <- 1
sigma_Sigma <- 1
alpha_sigma <- 2
kappa_sigma <- 1

## -------------------------------------------------------------
## Estimate concentrations for the first 7 taxa
## -------------------------------------------------------------
## this is a small number of iterations, only for illustration
## also, shows how to use control parameters for rstan::stan
stan_mod <- run_paramedic(W = example_16S_data[, 1:8], V = example_qPCR_data,
                      sigma_beta = sigma_beta, sigma_Sigma = sigma_Sigma,
                      alpha_sigma = alpha_sigma, kappa_sigma = kappa_sigma,
                      n_iter = 30, n_burnin = 25, n_chains = 1, stan_seed = 4747,
                      control = list(adapt_delta = 0.85, max_treedepth = 15),
                      verbose = FALSE)
stan_mod_summ <- summary(stan_mod$stan_fit, probs = c(0.025, 0.975))$summary
stan_mod_samps <- extract(stan_mod$stan_fit)

## -------------------------------------------------------------
## Extract posterior estimates for the taxon missing abolute abundance data
## -------------------------------------------------------------
posterior_summaries <- extract_posterior_summaries(stan_mod_summ, stan_mod_samps,
                                                   taxa_of_interest = 7, mult_num = 1,
                                                   level = 0.95, interval_type = "wald")

## -------------------------------------------------------------
## Print out the posterior mean efficiencies and concentrations
## -------------------------------------------------------------
posterior_summaries$estimates
posterior_summaries$est_efficiency

Citation

After using the paramedic package, please cite the following:

  @article{williamson2020,
    author={Williamson, BD and Hughes, JP and Willis, AD},
    title={A multi-view model for relative and absolute microbial abundances},
    journal={Biometrics},
    year={2021},
    note={doi.org/10.1111/biom.13503}
  }

Issues

We use Stan to fit the hierarchical model to the data. Read any warning messages returned by the algorithm carefully, as these can help diagnose convergence issues. Carefully choosing initialisation values can speed up convergence.

If you encounter any bugs or have any specific feature requests, please file an issue. If your feature request relates to the hierarchical models implemented in paramedic, please read the hierarchical models vignette prior to filing an issue or creating a pull request.