Skip to content

Commit

Permalink
MM-35: parse input against options (#31)
Browse files Browse the repository at this point in the history
  • Loading branch information
jdstamp authored Aug 3, 2021
1 parent 383c5df commit 7da2a0f
Show file tree
Hide file tree
Showing 9 changed files with 85 additions and 94 deletions.
65 changes: 33 additions & 32 deletions R/MAPIT.R
Original file line number Diff line number Diff line change
@@ -1,73 +1,73 @@
#' Multivariate MArginal ePIstasis Test (mvMAPIT)
#'
#'
#' \code{MvMAPIT} will run a version of the MArginal ePIstasis Test (MAPIT) under the following model variations:
#'
#'
#' This function will run a multivariate version of the MArginal ePIstasis Test (mvMAPIT).
#'
#'
#' (1) Standard Model: y = m+g+e where m ~ MVN(0,omega^2K), g ~ MVN(0,sigma^2G), e ~ MVN(0,tau^2M).
#' Recall from Crawford et al. (2017) that m is the combined additive effects from all other variants,
#' and effectively represents the additive effect of the kth variant under the polygenic background
#' and effectively represents the additive effect of the k-th variant under the polygenic background
#' of all other variants; K is the genetic relatedness matrix computed using
#' genotypes from all variants other than the kth; g is the summation of all pairwise interaction
#' effects between the kth variant and all other variants; G represents a relatedness matrix
#' computed based on pairwise interaction terms between the kth variant and all other variants. Here,
#' genotypes from all variants other than the k-th; g is the summation of all pairwise interaction
#' effects between the k-th variant and all other variants; G represents a relatedness matrix
#' computed based on pairwise interaction terms between the k-th variant and all other variants. Here,
#' we also denote D = diag(x_k) to be an n × n diagonal matrix with the genotype vector x_k as its
#' diagonal elements. It is important to note that both K and G change with every new marker k that is
#' considered. Lastly; M is a variant specific projection matrix onto both the null space of the intercept
#' and the corresponding genotypic vector x_k.
#'
#'
#' (2) Standard + Covariate Model: y = Wa+m+g+e where W is a matrix of covariates with effect sizes a.
#'
#'
#' (3) Standard + Common Environment Model: y = m+g+c+e where c ~ MVN(0,eta^2C) controls for extra
#' environmental effects and population structure with covariance matrix C.
#'
#'
#' (4) Standard + Covariate + Common Environment Model: y = Wa+m+g+c+e
#'
#'
#' This function will consider the following three hypothesis testing strategies which are featured in Crawford et al. (2017):
#' (1) The Normal or Z test
#' (2) Davies Method
#' (3) Hybrid Method (Z test + Davies Method)
#'
#'
#' @param X is the p x n genotype matrix where p is the number of variants and n is the number of samples. Must be a matrix and not a data.frame.
#' @param Y is the d x n matrix of d quantitative or continuous traits for n samples.
#' @param Z is the matrix q x n matrix of covariates. Must be a matrix and not a data.frame.
#' @param C is an n x n covariance matrix detailing environmental effects and population structure effects.
#' @param hybrid is a parameter detailing if the function should run the hybrid hypothesis testing procedure between the normal Z test and the Davies method. Default is TRUE.
#' @param threshold is a parameter detailing the value at which to recalibrate the Z test p values. If nothing is defined by the user, the default value will be 0.05 as recommended by the Crawford et al. (2017).
#' @param accuracy is a parameter setting the davies function numerical approximation accuracy. This parameter is not needed for the normal test. Smaller p-values than the accuracy will be zero.
#' @param test is a parameter defining what hypothesis test should be implemented. Takes on values 'normal' or 'davies'. This parameter only matters when hybrid = FALSE. If test is not defined when hybrid = FALSE, the function will automatically use test = 'normal'.
#' @param test is a parameter defining what hypothesis test should be implemented. Takes on values 'normal', 'davies', and 'hybrid'. The 'hybrid' test runs first the 'normal' test and then the 'davies' test on the significant variants.
#' @param cores is a parameter detailing the number of cores to parallelize over. It is important to note that this value only matters when the user has implemented OPENMP on their operating system. If OPENMP is not installed, then please leave cores = 1 and use the standard version of this code and software.
#' @param variantIndex is a vector containing indices of variants to be included in the computation.
#' @param phenotypeCovariance is a string parameter defining how to model the covariance between phenotypes of effects. Possible values: 'identity', 'covariance', 'homogeneous'.
#' @param logLevel is a string parameter defining the log level for the logging package.
#' @param phenotypeCovariance is a string parameter defining how to model the covariance between phenotypes of effects. Possible values: 'identity', 'covariance', 'homogeneous', 'combinatorial'.
#' @param logLevel is a string parameter defining the log level for the logging package.
#' @param logFile is a string parameter defining the name of the log file for the logging output.
#'
#' @return A list of P values and PVEs
#' @useDynLib mvMAPIT
#' @export
#' @import CompQuadForm
#'
MvMAPIT <- function(X,
Y,
MvMAPIT <- function(X,
Y,
Z = NULL,
C = NULL,
hybrid = TRUE,
C = NULL,
threshold = 0.05,
accuracy = 1e-8,
test = "normal",
cores = 1,
variantIndex = NULL,
phenotypeCovariance = 'identity',
logLevel = 'WARN',
test = c('normal', 'davies', 'hybrid'),
cores = 1,
variantIndex = NULL,
phenotypeCovariance = c('identity', 'covariance', 'homogeneous', 'combinatorial'),
logLevel = 'WARN',
logFile = NULL) {

test <- match.arg(test)
phenotypeCovariance <- match.arg(phenotypeCovariance)
if (cores > 1) {
if (cores > detectCores()) {
warning("The number of cores you're setting is larger than detected cores!")
cores <- detectCores()
}
}

logging::logReset()
logging::basicConfig(level = logLevel)
log <- logging::getLogger('MvMAPIT')
Expand All @@ -80,8 +80,9 @@ MvMAPIT <- function(X,
if (is.vector(Y)) {
Y <- t(Y)
}

log$debug('Running in %s test mode.', test)
log$debug('Phenotype covariance: %s', phenotypeCovariance)
log$debug('Genotype matrix: %d x %d', nrow(X), ncol(X))
log$debug('Phenotype matrix: %d x %d', nrow(Y), ncol(Y))
log$debug('Genotype matrix determinant: %f', det((X) %*% t(X)))
Expand All @@ -90,15 +91,15 @@ MvMAPIT <- function(X,
X <- remove_zero_variance(X) # operates on rows
log$debug('Genotype matrix after removing zero variance variants: %d x %d', nrow(X), ncol(X))

if (hybrid == TRUE) {
if (test == 'hybrid') {
vc.mod <- MAPITCpp(X, Y, Z, C, variantIndex, "normal", cores = cores, NULL, phenotypeCovariance) # Normal Z-Test
pvals <- vc.mod$pvalues
#row.names(pvals) <- rownames(X)
pves <- vc.mod$PVE
#row.names(pves) <- rownames(X)
timings <- vc.mod$timings
ind <- which(pvals <= threshold) # Find the indices of the p-values that are below the threshold
if (is.na(phenotypeCovariance) || phenotypeCovariance == '') {
if (phenotypeCovariance == 'combinatorial') {
any_significance <- apply(pvals, 1, function(r) any(r <= threshold))
ind_temp <- ind
ind <- which(any_significance == TRUE)
Expand All @@ -108,7 +109,7 @@ MvMAPIT <- function(X,
log$info('Running davies method on selected SNPs.')
vc.mod <- MAPITCpp(X, Y, Z, C, ind, "davies", cores = cores, NULL, phenotypeCovariance)
davies.pvals <- mvmapit_pvalues(vc.mod, X, accuracy)
if (is.na(phenotypeCovariance) || phenotypeCovariance == '') {
if (phenotypeCovariance == 'combinatorial') {
pvals[ind_temp] <- davies.pvals[ind_temp]
} else {
pvals[ind] <- davies.pvals[ind]
Expand All @@ -131,7 +132,7 @@ MvMAPIT <- function(X,
row.names(pves) <- rownames(X)
if (length(rownames(Y)) > 0) {
column_names <- mapit_struct_names(Y, phenotypeCovariance)
} else if (nrow(Y) > 1 && (is.na(phenotypeCovariance) || phenotypeCovariance == '')) {
} else if (nrow(Y) > 1 && (phenotypeCovariance == 'combinatorial')) {
row.names(Y) <- sprintf("P%s", 1:nrow(Y))
column_names <- mapit_struct_names(Y, phenotypeCovariance)
} else {
Expand All @@ -148,7 +149,7 @@ remove_zero_variance <- function(X) {

# This naming sequence has to match the creation of the q-matrix in the C++ routine of mvMAPIT
mapit_struct_names <- function (Y, phenotypeCovariance) {
if (length(phenotypeCovariance) > 0 && !(phenotypeCovariance == '')) {
if (length(phenotypeCovariance) > 0 && !(phenotypeCovariance == 'combinatorial')) {
return(c('kronecker'))
}
phenotype_names <- rownames(Y)
Expand Down
2 changes: 1 addition & 1 deletion R/RcppExports.R
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

MAPITCpp <- function(X, Y, Z = NULL, C = NULL, variantIndices = NULL, testMethod = "normal", cores = 1L, GeneticSimilarityMatrix = NULL, phenotypeCovariance = "") {
MAPITCpp <- function(X, Y, Z = NULL, C = NULL, variantIndices = NULL, testMethod = "normal", cores = 1L, GeneticSimilarityMatrix = NULL, phenotypeCovariance = "identity") {
.Call('_mvMAPIT_MAPITCpp', PACKAGE = 'mvMAPIT', X, Y, Z, C, variantIndices, testMethod, cores, GeneticSimilarityMatrix, phenotypeCovariance)
}

19 changes: 8 additions & 11 deletions man/MvMAPIT.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 6 additions & 6 deletions src/MAPIT.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -50,16 +50,17 @@ Rcpp::List MAPITCpp(
std::string testMethod = "normal",
int cores = 1,
Rcpp::Nullable<Rcpp::NumericMatrix> GeneticSimilarityMatrix = R_NilValue,
std::string phenotypeCovariance = "") {
std::string phenotypeCovariance = "identity") {
int i;
const int n = X.n_cols;
const int p = X.n_rows;
const int d = Y.n_rows;
int num_combinations = 1;
int z = 0;

const bool pairwise = (phenotypeCovariance.empty() && d > 1);
if (pairwise) {
const bool combinatorial =
(phenotypeCovariance.compare("combinatorial") == 0 && d > 1);
if (combinatorial) {
num_combinations = num_combinations_with_replacement(d, 2);
}

Expand All @@ -75,7 +76,6 @@ Rcpp::List MAPITCpp(
logger->info("Number of phenotypes: {}", d);
logger->info("Test method: {}", testMethod);
logger->info("Phenotype covariance model: {}", phenotypeCovariance);
logger->info("mvMAPIT version pairwise: {}", pairwise);

#ifdef _OPENMP
logger->info("Execute c++ routine on {} cores.", cores);
Expand All @@ -90,7 +90,7 @@ Rcpp::List MAPITCpp(
arma::mat execution_t(p, 6);

int L_rows, L_cols;
if (pairwise) {
if (combinatorial) {
L_rows = n;
L_cols = num_combinations;
} else {
Expand Down Expand Up @@ -200,7 +200,7 @@ Rcpp::List MAPITCpp(
arma::mat q;
std::vector<arma::vec> phenotypes;
start = steady_clock::now();
if (pairwise) {
if (combinatorial) {
phenotypes = matrix_to_vector_of_rows(Yc);

} else {
Expand Down
2 changes: 0 additions & 2 deletions tests/testthat/test-mvmapit-argument-parsing.R
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,6 @@ test_that("MvMapit can take a vector as phenotype input. hybrid = FALSE, test =
# when
mapit <- MvMAPIT(t(X),
Y,
hybrid = FALSE,
accuracy = 1e-5,
cores = 1,
phenotypeCovariance = 'identity',
Expand All @@ -29,7 +28,6 @@ test_that("MvMapit can take a vector as phenotype input. hybrid = FALSE, test =
# when
mapit <- MvMAPIT(t(X),
Y,
hybrid = FALSE,
test = 'davies',
accuracy = 1e-5,
cores = 1,
Expand Down
Loading

0 comments on commit 7da2a0f

Please sign in to comment.