
Commit 7da2a0f

MM-35: parse input against options (#31)
1 parent 383c5df commit 7da2a0f

9 files changed, with 85 additions and 94 deletions

R/MAPIT.R

Lines changed: 33 additions & 32 deletions
@@ -1,73 +1,73 @@
 #' Multivariate MArginal ePIstasis Test (mvMAPIT)
-#'
+#'
 #' \code{MvMAPIT} will run a version of the MArginal ePIstasis Test (MAPIT) under the following model variations:
-#'
+#'
 #' This function will run a multivariate version of the MArginal ePIstasis Test (mvMAPIT).
-#'
+#'
 #' (1) Standard Model: y = m+g+e where m ~ MVN(0,omega^2K), g ~ MVN(0,sigma^2G), e ~ MVN(0,tau^2M).
 #' Recall from Crawford et al. (2017) that m is the combined additive effects from all other variants,
-#' and effectively represents the additive effect of the kth variant under the polygenic background
+#' and effectively represents the additive effect of the k-th variant under the polygenic background
 #' of all other variants; K is the genetic relatedness matrix computed using
-#' genotypes from all variants other than the kth; g is the summation of all pairwise interaction
-#' effects between the kth variant and all other variants; G represents a relatedness matrix
-#' computed based on pairwise interaction terms between the kth variant and all other variants. Here,
+#' genotypes from all variants other than the k-th; g is the summation of all pairwise interaction
+#' effects between the k-th variant and all other variants; G represents a relatedness matrix
+#' computed based on pairwise interaction terms between the k-th variant and all other variants. Here,
 #' we also denote D = diag(x_k) to be an n × n diagonal matrix with the genotype vector x_k as its
 #' diagonal elements. It is important to note that both K and G change with every new marker k that is
 #' considered. Lastly; M is a variant specific projection matrix onto both the null space of the intercept
 #' and the corresponding genotypic vector x_k.
-#'
+#'
 #' (2) Standard + Covariate Model: y = Wa+m+g+e where W is a matrix of covariates with effect sizes a.
-#'
+#'
 #' (3) Standard + Common Environment Model: y = m+g+c+e where c ~ MVN(0,eta^2C) controls for extra
 #' environmental effects and population structure with covariance matrix C.
-#'
+#'
 #' (4) Standard + Covariate + Common Environment Model: y = Wa+m+g+c+e
-#'
+#'
 #' This function will consider the following three hypothesis testing strategies which are featured in Crawford et al. (2017):
 #' (1) The Normal or Z test
 #' (2) Davies Method
 #' (3) Hybrid Method (Z test + Davies Method)
-#'
+#'
 #' @param X is the p x n genotype matrix where p is the number of variants and n is the number of samples. Must be a matrix and not a data.frame.
 #' @param Y is the d x n matrix of d quantitative or continuous traits for n samples.
 #' @param Z is the matrix q x n matrix of covariates. Must be a matrix and not a data.frame.
 #' @param C is an n x n covariance matrix detailing environmental effects and population structure effects.
-#' @param hybrid is a parameter detailing if the function should run the hybrid hypothesis testing procedure between the normal Z test and the Davies method. Default is TRUE.
 #' @param threshold is a parameter detailing the value at which to recalibrate the Z test p values. If nothing is defined by the user, the default value will be 0.05 as recommended by the Crawford et al. (2017).
 #' @param accuracy is a parameter setting the davies function numerical approximation accuracy. This parameter is not needed for the normal test. Smaller p-values than the accuracy will be zero.
-#' @param test is a parameter defining what hypothesis test should be implemented. Takes on values 'normal' or 'davies'. This parameter only matters when hybrid = FALSE. If test is not defined when hybrid = FALSE, the function will automatically use test = 'normal'.
+#' @param test is a parameter defining what hypothesis test should be implemented. Takes on values 'normal', 'davies', and 'hybrid'. The 'hybrid' test runs first the 'normal' test and then the 'davies' test on the significant variants.
 #' @param cores is a parameter detailing the number of cores to parallelize over. It is important to note that this value only matters when the user has implemented OPENMP on their operating system. If OPENMP is not installed, then please leave cores = 1 and use the standard version of this code and software.
 #' @param variantIndex is a vector containing indices of variants to be included in the computation.
-#' @param phenotypeCovariance is a string parameter defining how to model the covariance between phenotypes of effects. Possible values: 'identity', 'covariance', 'homogeneous'.
-#' @param logLevel is a string parameter defining the log level for the logging package.
+#' @param phenotypeCovariance is a string parameter defining how to model the covariance between phenotypes of effects. Possible values: 'identity', 'covariance', 'homogeneous', 'combinatorial'.
+#' @param logLevel is a string parameter defining the log level for the logging package.
 #' @param logFile is a string parameter defining the name of the log file for the logging output.
 #'
 #' @return A list of P values and PVEs
 #' @useDynLib mvMAPIT
 #' @export
 #' @import CompQuadForm
 #'
-MvMAPIT <- function(X,
-Y,
+MvMAPIT <- function(X,
+Y,
 Z = NULL,
-C = NULL,
-hybrid = TRUE,
+C = NULL,
 threshold = 0.05,
 accuracy = 1e-8,
-test = "normal",
-cores = 1,
-variantIndex = NULL,
-phenotypeCovariance = 'identity',
-logLevel = 'WARN',
+test = c('normal', 'davies', 'hybrid'),
+cores = 1,
+variantIndex = NULL,
+phenotypeCovariance = c('identity', 'covariance', 'homogeneous', 'combinatorial'),
+logLevel = 'WARN',
 logFile = NULL) {
 
+test <- match.arg(test)
+phenotypeCovariance <- match.arg(phenotypeCovariance)
 if (cores > 1) {
 if (cores > detectCores()) {
 warning("The number of cores you're setting is larger than detected cores!")
 cores <- detectCores()
 }
 }
-
+
 logging::logReset()
 logging::basicConfig(level = logLevel)
 log <- logging::getLogger('MvMAPIT')
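The two `match.arg()` calls added in this hunk are what tie the accepted values to the documented option lists. A minimal sketch of that behaviour follows, assuming nothing beyond base R; `parse_options` is a hypothetical stand-in written for illustration and is not a function in mvMAPIT.

```r
# Hypothetical stand-in that copies only the two option vectors from the new
# MvMAPIT signature; everything else about the real function is omitted.
parse_options <- function(test = c('normal', 'davies', 'hybrid'),
                          phenotypeCovariance = c('identity', 'covariance',
                                                  'homogeneous', 'combinatorial')) {
  # match.arg() picks the first choice when the argument is left at its
  # default and stops with an error for values outside the list.
  test <- match.arg(test)
  phenotypeCovariance <- match.arg(phenotypeCovariance)
  list(test = test, phenotypeCovariance = phenotypeCovariance)
}

# Arguments omitted: the first element of each vector is used.
defaults <- parse_options()

# Explicit, fully spelled values pass through unchanged.
explicit <- parse_options(test = 'hybrid', phenotypeCovariance = 'combinatorial')

# match.arg() also accepts unambiguous prefixes.
partial <- parse_options(test = 'dav')

# A value outside the list raises an error; capture its message here.
rejected <- tryCatch(parse_options(test = 'pairwise'),
                     error = function(e) conditionMessage(e))
```

With this pattern the first element of each vector acts as the default, so omitting `test` selects `'normal'`. Under the old signature the default was `hybrid = TRUE`, so callers who relied on that default would presumably need to pass `test = 'hybrid'` explicitly.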
@@ -80,8 +80,9 @@ MvMAPIT <- function(X,
 if (is.vector(Y)) {
 Y <- t(Y)
 }
-
+
 log$debug('Running in %s test mode.', test)
+log$debug('Phenotype covariance: %s', phenotypeCovariance)
 log$debug('Genotype matrix: %d x %d', nrow(X), ncol(X))
 log$debug('Phenotype matrix: %d x %d', nrow(Y), ncol(Y))
 log$debug('Genotype matrix determinant: %f', det((X) %*% t(X)))
@@ -90,15 +91,15 @@ MvMAPIT <- function(X,
 X <- remove_zero_variance(X) # operates on rows
 log$debug('Genotype matrix after removing zero variance variants: %d x %d', nrow(X), ncol(X))
 
-if (hybrid == TRUE) {
+if (test == 'hybrid') {
 vc.mod <- MAPITCpp(X, Y, Z, C, variantIndex, "normal", cores = cores, NULL, phenotypeCovariance) # Normal Z-Test
 pvals <- vc.mod$pvalues
 #row.names(pvals) <- rownames(X)
 pves <- vc.mod$PVE
 #row.names(pves) <- rownames(X)
 timings <- vc.mod$timings
 ind <- which(pvals <= threshold) # Find the indices of the p-values that are below the threshold
-if (is.na(phenotypeCovariance) || phenotypeCovariance == '') {
+if (phenotypeCovariance == 'combinatorial') {
 any_significance <- apply(pvals, 1, function(r) any(r <= threshold))
 ind_temp <- ind
 ind <- which(any_significance == TRUE)
@@ -108,7 +109,7 @@ MvMAPIT <- function(X,
 log$info('Running davies method on selected SNPs.')
 vc.mod <- MAPITCpp(X, Y, Z, C, ind, "davies", cores = cores, NULL, phenotypeCovariance)
 davies.pvals <- mvmapit_pvalues(vc.mod, X, accuracy)
-if (is.na(phenotypeCovariance) || phenotypeCovariance == '') {
+if (phenotypeCovariance == 'combinatorial') {
 pvals[ind_temp] <- davies.pvals[ind_temp]
 } else {
 pvals[ind] <- davies.pvals[ind]
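The hybrid branch recalibrates only the p-values at or below `threshold`. The indexing for the `'combinatorial'` case can be sketched on made-up numbers. Both matrices below are invented stand-ins for the two `MAPITCpp` results (the Davies values are simply halved for illustration), not output of the package.

```r
threshold <- 0.05

# Made-up Z-test p-values: 4 variants (rows) x 3 trait combinations (columns).
# matrix() fills column by column.
normal_pvals <- matrix(c(0.20, 0.01, 0.60, 0.04,
                         0.30, 0.50, 0.70, 0.90,
                         0.03, 0.40, 0.80, 0.10), nrow = 4)

# Invented stand-in for the Davies results on the same grid.
davies_pvals <- normal_pvals / 2

# Linear indices of the individual cells at or below the threshold
# (this plays the role of ind_temp in the diff).
cell_index <- which(normal_pvals <= threshold)

# Row indices of variants with at least one such cell
# (this plays the role of ind, the variants re-run with the Davies method).
variant_index <- which(apply(normal_pvals, 1, function(r) any(r <= threshold)))

# Overwrite only the significant cells with their Davies counterparts.
hybrid_pvals <- normal_pvals
hybrid_pvals[cell_index] <- davies_pvals[cell_index]
```

In this toy input the third variant has no cell at or below the threshold, so it would not be re-run and all of its p-values stay at the Z-test values.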
@@ -131,7 +132,7 @@ MvMAPIT <- function(X,
 row.names(pves) <- rownames(X)
 if (length(rownames(Y)) > 0) {
 column_names <- mapit_struct_names(Y, phenotypeCovariance)
-} else if (nrow(Y) > 1 && (is.na(phenotypeCovariance) || phenotypeCovariance == '')) {
+} else if (nrow(Y) > 1 && (phenotypeCovariance == 'combinatorial')) {
 row.names(Y) <- sprintf("P%s", 1:nrow(Y))
 column_names <- mapit_struct_names(Y, phenotypeCovariance)
 } else {
@@ -148,7 +149,7 @@ remove_zero_variance <- function(X) {
 
 # This naming sequence has to match the creation of the q-matrix in the C++ routine of mvMAPIT
 mapit_struct_names <- function (Y, phenotypeCovariance) {
-if (length(phenotypeCovariance) > 0 && !(phenotypeCovariance == '')) {
+if (length(phenotypeCovariance) > 0 && !(phenotypeCovariance == 'combinatorial')) {
 return(c('kronecker'))
 }
 phenotype_names <- rownames(Y)

R/RcppExports.R

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 # Generated by using Rcpp::compileAttributes() -> do not edit by hand
 # Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393
 
-MAPITCpp <- function(X, Y, Z = NULL, C = NULL, variantIndices = NULL, testMethod = "normal", cores = 1L, GeneticSimilarityMatrix = NULL, phenotypeCovariance = "") {
+MAPITCpp <- function(X, Y, Z = NULL, C = NULL, variantIndices = NULL, testMethod = "normal", cores = 1L, GeneticSimilarityMatrix = NULL, phenotypeCovariance = "identity") {
 .Call('_mvMAPIT_MAPITCpp', PACKAGE = 'mvMAPIT', X, Y, Z, C, variantIndices, testMethod, cores, GeneticSimilarityMatrix, phenotypeCovariance)
 }
 

man/MvMAPIT.Rd

Lines changed: 8 additions & 11 deletions
Some generated files are not rendered by default.

src/MAPIT.cpp

Lines changed: 6 additions & 6 deletions
@@ -50,16 +50,17 @@ Rcpp::List MAPITCpp(
 std::string testMethod = "normal",
 int cores = 1,
 Rcpp::Nullable<Rcpp::NumericMatrix> GeneticSimilarityMatrix = R_NilValue,
-std::string phenotypeCovariance = "") {
+std::string phenotypeCovariance = "identity") {
 int i;
 const int n = X.n_cols;
 const int p = X.n_rows;
 const int d = Y.n_rows;
 int num_combinations = 1;
 int z = 0;
 
-const bool pairwise = (phenotypeCovariance.empty() && d > 1);
-if (pairwise) {
+const bool combinatorial =
+(phenotypeCovariance.compare("combinatorial") == 0 && d > 1);
+if (combinatorial) {
 num_combinations = num_combinations_with_replacement(d, 2);
 }
 
@@ -75,7 +76,6 @@ Rcpp::List MAPITCpp(
 logger->info("Number of phenotypes: {}", d);
 logger->info("Test method: {}", testMethod);
 logger->info("Phenotype covariance model: {}", phenotypeCovariance);
-logger->info("mvMAPIT version pairwise: {}", pairwise);
 
 #ifdef _OPENMP
 logger->info("Execute c++ routine on {} cores.", cores);
@@ -90,7 +90,7 @@ Rcpp::List MAPITCpp(
 arma::mat execution_t(p, 6);
 
 int L_rows, L_cols;
-if (pairwise) {
+if (combinatorial) {
 L_rows = n;
 L_cols = num_combinations;
 } else {
@@ -200,7 +200,7 @@ Rcpp::List MAPITCpp(
 arma::mat q;
 std::vector<arma::vec> phenotypes;
 start = steady_clock::now();
-if (pairwise) {
+if (combinatorial) {
 phenotypes = matrix_to_vector_of_rows(Yc);
 
 } else {
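When `combinatorial` is true, `num_combinations` is set from `num_combinations_with_replacement(d, 2)`. The body of that helper is not part of this diff. Assuming it returns the standard count of unordered pairs with repetition, d(d + 1)/2, the expected number of trait combinations can be checked in R as sketched below. The function names and the pair ordering are illustrative only and may not match the ordering used by the C++ routine.

```r
# Assumed formula: number of ways to choose k items out of d with replacement.
count_pairs <- function(d, k = 2) choose(d + k - 1, k)

# Enumerate the unordered trait pairs (i <= j) for d traits.
enumerate_pairs <- function(d) {
  pairs <- expand.grid(i = seq_len(d), j = seq_len(d))
  pairs <- pairs[pairs$i <= pairs$j, ]
  pairs <- pairs[order(pairs$i, pairs$j), ]
  rownames(pairs) <- NULL
  pairs
}

d <- 3
n_pairs <- count_pairs(d)
pairs <- enumerate_pairs(d)

# The closed-form count and the explicit enumeration should agree.
stopifnot(nrow(pairs) == n_pairs)
```

For a single trait the count is 1, which is consistent with the diff leaving `num_combinations` at its initial value of 1 whenever `d > 1` does not hold.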

tests/testthat/test-mvmapit-argument-parsing.R

Lines changed: 0 additions & 2 deletions
@@ -9,7 +9,6 @@ test_that("MvMapit can take a vector as phenotype input. hybrid = FALSE, test =
 # when
 mapit <- MvMAPIT(t(X),
 Y,
-hybrid = FALSE,
 accuracy = 1e-5,
 cores = 1,
 phenotypeCovariance = 'identity',
@@ -29,7 +28,6 @@ test_that("MvMapit can take a vector as phenotype input. hybrid = FALSE, test =
 # when
 mapit <- MvMAPIT(t(X),
 Y,
-hybrid = FALSE,
 test = 'davies',
 accuracy = 1e-5,
 cores = 1,
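The tests drop `hybrid = FALSE` because the argument no longer exists in the `MvMAPIT` signature. For callers migrating older scripts, the old pair of arguments appears to map onto the new `test` value roughly as follows, going by the removed `@param` text, which said `test` only mattered when `hybrid = FALSE`. `legacy_to_test` is a hypothetical helper written for illustration, not a function in mvMAPIT.

```r
# Hypothetical helper: translate the pre-commit (hybrid, test) pair into the
# single post-commit test value.
legacy_to_test <- function(hybrid = TRUE, test = c('normal', 'davies')) {
  test <- match.arg(test)
  # Under the old interface hybrid = TRUE took precedence over test.
  if (isTRUE(hybrid)) 'hybrid' else test
}

old_default <- legacy_to_test()                                 # old defaults
old_normal  <- legacy_to_test(hybrid = FALSE)                   # first test case above
old_davies  <- legacy_to_test(hybrid = FALSE, test = 'davies')  # second test case above
```

The result could then be passed as `test = ...` in a call to the updated function.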
