Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v0.7.1.9000] Gene based analysis option #40

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
c6bd774
not sure
Chris1221 Jul 12, 2016
183288e
merge
Chris1221 Jul 12, 2016
d2e97c4
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Jul 12, 2016
71cc5de
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Jul 12, 2016
d0e1753
this is wrong?
Chris1221 Jul 22, 2016
c3ea7ac
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Jul 22, 2016
5d18977
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Jul 22, 2016
7eee77a
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Jul 26, 2016
b84faf8
revert
Chris1221 Aug 19, 2016
2de9f4a
add in genes
Chris1221 Aug 20, 2016
61d26bf
add in gene mode #38
Chris1221 Aug 21, 2016
c5bcbce
add futile logger to @Imports
Chris1221 Aug 21, 2016
22eea24
Merge branch 'devel' of https://github.com/Chris1221/coR-ge into devel
Chris1221 Aug 22, 2016
43993ec
from cluster
Chris1221 Aug 22, 2016
788baa6
use LD with genes
Chris1221 Aug 22, 2016
8f15f52
update writing of file, maybe fix problem? probably not
Chris1221 Aug 22, 2016
d72b324
update writing of file, maybe fix problem? probably not
Chris1221 Aug 22, 2016
ef63f8e
update writing of file, maybe fix problem? probably not
Chris1221 Aug 22, 2016
632b025
fix passing of kb for genes
Chris1221 Aug 24, 2016
54544ef
revert
Chris1221 Aug 24, 2016
0e1a1b0
update readme
Chris1221 Aug 25, 2016
4494803
remove legacy code
Chris1221 Aug 25, 2016
bbf8b77
document and update onload function
Chris1221 Aug 25, 2016
74c2bdc
skip gene test for now
Chris1221 Aug 25, 2016
882baa0
commit from cluster
Chris1221 Oct 10, 2016
0471bd6
cluster
Chris1221 Oct 10, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: coRge
Type: Package
Title: An R package for the Examination of Multiple Correction Methodologies in Accurate Genomic Environments.
Version: 0.6.7.9000
Version: 0.7.0.9000
Date: 2016-21-07
Author: Christopher B. Cole
Maintainer: Christopher B. Cole <[email protected]>
Expand All @@ -12,6 +12,7 @@ Imports: Rcpp (>= 0.11.3),
foreach,
devtools,
lazyeval,
futile.logger,
data.table,
magrittr,
dplyr
Expand Down
4 changes: 1 addition & 3 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -15,18 +15,16 @@ export(ld_cor)
export(phen)
export(rand)
export(rand0)
export(rebuild)
export(returnLD)
export(sim.gen)
export(stratify)
export(sub_job)
export(th)
import(devtools)
import(foreach)
importFrom(Rcpp,compileAttributes)
import(futile.logger)
importFrom(Rcpp,evalCpp)
importFrom(data.table,fread)
importFrom(devtools,document)
importFrom(dplyr,"%>%")
importFrom(dplyr,filter)
importFrom(dplyr,filter_)
Expand Down
110 changes: 91 additions & 19 deletions R/analyze.R
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
#' Correction of Genomes in R
#' @title Calculate FDR and sFDR for a simulated GWAS
#'
#' Software for the Examination of Multiple Correction Methodologies in Accurate Genomic Environments
#'
Expand All @@ -13,15 +13,16 @@
#' @param pc Proportion of causal SNPs in the second strata
#' @param pnc Proportion of noncausal SNPs in the second strata
#' @param nc Number of causal SNPs
#' @param gen Gen matrix if local
#' @param gen Gen matrix if local
#' @param summary Summary file if preread
#' @param maf Maf or no
#' @param maf_range MAF range
#'
#' @import futile.logger
#' @import foreach
#' @import devtools
#' @importFrom lazyeval interp
#' @importFrom data.table fread
#' @importFrom data.table fread
#' @importFrom magrittr %<>%
#' @importFrom dplyr mutate mutate_ filter filter_ select select_ sample_n %>%
#' @importFrom Rcpp evalCpp
Expand All @@ -33,12 +34,24 @@



analyze <- function(i = double(), j = double(), mode = "default", path.base = "/scratch/hpc2862/CAMH/perm_container/container_", summary.file = "/scratch/hpc2862/CAMH/perm_container/snp_summary2.out", output = "~/repos/coR-ge/data/test_run2.txt", test = TRUE, safe = TRUE, local = FALSE, h2 = 0.45, pc = 0.5, pnc = 0.5, nc = 1000, gen = NULL, summary = summary, maf = FALSE, maf_range = NULL){


#message(paste0("coR-ge v", packageVersion("coRge"), " \t \t http://github.io/Chris1221/coR-ge \n \n \t Parameters in use: \n \t \t i: ",i, "\n \t \t j: ", j, "\n \t \t path.base: ", path.base,"\n \t \t summary.file: ", summary.file,"\n \t \t output: ", output,"\n \t \t th2: ", h2,"\n \t \t pc: ", pc,"\n \t \t pnc: ", pnc,"\n \t \t nc: ", nc, "\n \n \t Local options: \n \t \t local: ", local,"\n \t \t gen ", gen,"\n \t \t summary: ", summary, " \n \n \t Testing options: \n \t \t test: ", test,"\n \t \t safe: ", safe, "\n \n LOG: \n"))


analyze <- function(i = NA,
j = NA,
mode = "default",
path.base = "/scratch/hpc2862/CAMH/perm_container/container_",
summary.file = "/scratch/hpc2862/CAMH/perm_container/snp_summary2.out",
output = "~/repos/coR-ge/data/test_run2.txt",
test = TRUE,
safe = TRUE,
local = FALSE,
h2 = 0.45,
pc = 0.5,
pnc = 0.5,
nc = 1000,
gen = NULL,
summary = summary,
maf = FALSE,
maf_range = NULL,
gene_kb = 0){

if(local) {

Expand All @@ -49,7 +62,7 @@ analyze <- function(i = double(), j = double(), mode = "default", path.base = "/

message("Error Checking")

if(any(is.null(c(i,j,path.base, summary.file)))) stop("Please complete all input arguemnets")
#if(any(is.null(c(path.base, summary.file)))) stop("Please complete all input arguemnets")

path <- paste0(path.base,i,"_",j,"/")
setwd(path)
Expand Down Expand Up @@ -132,7 +145,7 @@ analyze <- function(i = double(), j = double(), mode = "default", path.base = "/

out <- correct(strata=strata, n_strata = n_strata, assoc = P_list, group = FALSE)




# ___ _
Expand Down Expand Up @@ -241,24 +254,83 @@ analyze <- function(i = double(), j = double(), mode = "default", path.base = "/
# Initiate it with 0 and then sequentially overwrite.
strata$ld <- 0

# Calculate the LD from ld.cpp
# Calculate the LD from ld.cpp
LdList <- ld_cor(snps, gen)

# Loop through each of the LD TP thresholds.
for(th in c(0.2, 0.4, 0.6, 0.8, 0.9, 1)){

# Create a list of SNPs which have higher than the threshold
# LD.
snp_b <- LdList$rsid[LdList$ld > th]

# Assign the threshold value to each of these SNPs.
strata$ld[strata$rsid %in% snp_b] <- th
}

out <- correct(strata=strata, n_strata = n_strata, assoc = P_list, group = FALSE, mode = "ld")


}
} else if(mode == "genes"){


message("Selecting Causal SNPs")

snps <- causal.snps(summary, nc = nc, maf = maf)
colnames(snps)[3] <- "V3"


message("Merging together and converting from Oxford to R format...")

comb <- as.data.frame(merge(gen, snps, by = "V3"))

comb$rsid <- NULL
comb$chromosome <- NULL
comb$all_maf <- NULL
comb$k <- NULL
comb$chromosomes <- NULL

WAS <- calculate_was(gen = comb, snps = snps, h2 = h2)

samp$Z <- as.character(foreach(q = 1:length(WAS), .combine = 'c') %do% WAS[q] + rnorm(1, 0, sd = sqrt(1-h2)))


var <- data.frame(0, 0, 0, "P")
samp$Z <- as.character(samp$Z)
colnames(var) <- colnames(samp)
samp <- rbind(var, samp)

P_list <- assoc_wrapper(gen, samp)

message("Performing correction")

snps %>% select(rsid) %>% as.vector -> snp_list

#this is a major assumption so leave it
n_strata <- 2
strata <- stratify(snp_list = snp_list, summary = summary, n_strata = n_strata, pc = pc, pnc = pnc, mode = "genes", gene_kb = gene_kb)

# Initialize a column for the LD to sit in
# Initiate it with 0 and then sequentially overwrite.
strata$ld <- 0

# Calculate the LD from ld.cpp
LdList <- ld_cor(snps, gen)

# Loop through each of the LD TP thresholds.
for(th in c(0.2, 0.4, 0.6, 0.8, 0.9, 1)){

# Create a list of SNPs which have higher than the threshold
# LD.
snp_b <- LdList$rsid[LdList$ld > th]

# Assign the threshold value to each of these SNPs.
strata$ld[strata$rsid %in% snp_b] <- th
}
out <- correct(strata=strata, n_strata = n_strata, assoc = P_list, group = FALSE, mode="ld")


}


out$h2 <- h2
Expand All @@ -270,10 +342,10 @@ analyze <- function(i = double(), j = double(), mode = "default", path.base = "/
out$maf_l <- maf_range[1]
out$maf_u <- maf_range[2]

write.table(out, file = output, append = T, quote = F, row.names = F, col.names = F)
# if(!file.exists(output)) suppressWarnings(write.table(out, output, row.names = F, col.names = TRUE, quote = F, append = T)) else if(file.exists(output)) write.table(out, output, row.names = F, col.names = F, quote = F, append = T)

if(!file.exists(output)) suppressWarnings(write.table(out, output, row.names = F, col.names = TRUE, quote = F, append = T)) else if(file.exists(output)) write.table(out, output, row.names = F, col.names = F, quote = F, append = T)


message("This has passed the writting")
return(0)

}
2 changes: 1 addition & 1 deletion R/onAttach.R
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
if (interactive()) {
v = packageVersion("coRge")

packageStartupMessage(paste0("coRge v", v, ". This version is unstable and should not be used in development. More information and guide at https://github.com/chris1221/cor-ge."))
packageStartupMessage(paste0("coRge v", v, ".\nGuides, issues, and FAQ: https://github.com/Chris1221/coRge \nPrepublication:\t\t arXiv: XXXXX \nCitation:\t\t citation(\"coRge\")"))


}
Expand Down
81 changes: 70 additions & 11 deletions R/stratify.R
Original file line number Diff line number Diff line change
Expand Up @@ -13,23 +13,82 @@
#'
#' @export

stratify <- function(snp_list = NULL, summary = NULL, pnc = NULL, pc = NULL, n_strata = NULL){
stratify <- function(snp_list = NULL, summary = NULL, pnc = NULL, pc = NULL, n_strata = NULL, mode = "default", gene_kb = 0){

flog.debug("Entering into the stratify() function block")

if(any(c(is.null(snp_list), is.null(summary), is.null(n_strata)))) stop("Missing a required input arguement")

h0_summary <- summary %>% filter(!(rsid %in% unlist(snp_list))) %>% mutate(h1 = F)
h1_summary <- summary %>% filter(rsid %in% unlist(snp_list)) %>% mutate(h1 = T)

h1_s1 <- h1_summary %>% sample_n(floor(nrow(h1_summary)*pc))
h1_s2 <- h1_summary %>% filter(!(rsid %in% h1_s1$rsid))
if(mode == "genes"){

flog.debug("Gene mode selected")

h0_summary <- summary %>%
filter(!(rsid %in% unlist(snp_list))) %>%
mutate(h1 = F)
h1_summary <- summary %>%
filter(rsid %in% unlist(snp_list)) %>%
mutate(h1 = T)

flog.trace(paste0("There are ", nrow(h1_summary), " causal SNPs in the current analysis and ", nrow(h0_summary), " non causal SNPs."))

# Run through each SNP in the h1 summary and give it a gene
# identifyier to be used in the gene column.

# Randomly put floor(pc*nrow(h1_summary)) causal into the h1
# and the rest into the second strata.
h1_s1 <- h1_summary %>%
sample_n(floor(nrow(h1_summary)*pc))
h1_s2 <- h1_summary %>%
filter(!(rsid %in% h1_s1$rsid))

s1 <- h0_summary %>%
filter(position > h1_s1$position[1] - gene_kb*1000 & position < h1_s1$position[1] + gene_kb*1000) %>% nrow

for(i in 2:nrow(h1_s1)){
s1 <- h0_summary %>%
filter(position > h1_s1$position[1] - gene_kb*1000 & position < h1_s1$position[1] + gene_kb*1000) %>%
rbind(s1)
}

s2 <- h0_summary[!(h0_summary$rsid %in% s1$rsid),]

s1 %<>% rbind(h1_s1) %>%
mutate(s = 1)
s2 %<>% rbind(h1_s2) %>%
mutate(s = 2)

strata <- rbind(s1, s2)



} else {

h0_summary <- summary %>%
filter(!(rsid %in% unlist(snp_list))) %>%
mutate(h1 = F)
h1_summary <- summary %>%
filter(rsid %in% unlist(snp_list)) %>%
mutate(h1 = T)

h1_s1 <- h1_summary %>%
sample_n(floor(nrow(h1_summary)*pc))
h1_s2 <- h1_summary %>%
filter(!(rsid %in% h1_s1$rsid))

s1 <- h0_summary %>%
sample_n(floor(nrow(h0_summary)*pnc) - length(snp_list))
s2 <- h0_summary %>%
filter(!(rsid %in% s1$rsid))

s1 <- h0_summary %>% sample_n(floor(nrow(h0_summary)*pnc) - length(snp_list))
s2 <- h0_summary %>% filter(!(rsid %in% s1$rsid))
s1 %<>% rbind(h1_s1) %>%
mutate(s = 1)
s2 %<>% rbind(h1_s2) %>%
mutate(s = 2)

s1 %<>% rbind(h1_s1) %>% mutate(s = 1)
s2 %<>% rbind(h1_s2) %>% mutate(s = 2)
strata <- rbind(s1, s2)
}

strata <- rbind(s1, s2)
return(strata)

}
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,16 +5,16 @@

| Branch | Travis-CI | Appveyor | Coverage | CRAN | Downloads | Publication |
| :--- | :---: | :---: | :--: | :---: | :---: | :---: |
| `master` | ![Build Status](https://travis-ci.org/Chris1221/coR-ge.svg?branch=master) | ![Build status](https://ci.appveyor.com/api/projects/status/v64oe85q29btxln9?svg=true) | [![codecov.io](https://codecov.io/github/Chris1221/coR-ge/coverage.svg?branch=master)](https://codecov.io/github/Chris1221/coR-ge?branch=master) | ![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/coRge) | ![](http://cranlogs.r-pkg.org/badges/coRge) | GitXiv |
| `devel` |![Build Status](https://travis-ci.org/Chris1221/coR-ge.svg?branch=devel) | [![Build status](https://ci.appveyor.com/api/projects/status/v64oe85q29btxln9?svg=true)](https://ci.appveyor.com/project/Chris1221/miner) | [![codecov.io](https://codecov.io/github/Chris1221/coR-ge/coverage.svg?branch=devel)](https://codecov.io/github/Chris1221/coR-ge?branch=devel) | ![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/coRge) | ![](http://cranlogs.r-pkg.org/badges/coRge) | GitXiv |
| `master` | ![Build Status](https://travis-ci.org/Chris1221/coRge.svg?branch=master) | ![Build status](https://ci.appveyor.com/api/projects/status/v64oe85q29btxln9?svg=true) | [![codecov.io](https://codecov.io/github/Chris1221/coRge/coverage.svg?branch=master)](https://codecov.io/github/Chris1221/coRge?branch=master) | ![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/coRge) | ![](http://cranlogs.r-pkg.org/badges/coRge) | GitXiv |
| `devel` |![Build Status](https://travis-ci.org/Chris1221/coRge.svg?branch=devel) | [![Build status](https://ci.appveyor.com/api/projects/status/v64oe85q29btxln9?svg=true)](https://ci.appveyor.com/project/Chris1221/miner) | [![codecov.io](https://codecov.io/github/Chris1221/coRge/coverage.svg?branch=devel)](https://codecov.io/github/Chris1221/coRge?branch=devel) | ![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/coRge) | ![](http://cranlogs.r-pkg.org/badges/coRge) | GitXiv |

-----------------------------------

To install:

```R
if(!require(devtools)) install.packages("devtools")
devtools::install_github("Chris1221/coR-ge")
devtools::install_github("Chris1221/coRge")
```

To run:
Expand Down
19 changes: 19 additions & 0 deletions data/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# History files
.Rhistory
.Rapp.history

# Session Data files
.RData

# Example code in package build process
*-Ex.R

# RStudio files
.Rproj.user/

# produced vignettes
vignettes/*.html
vignettes/*.pdf

# OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
.httr-oauth
26 changes: 26 additions & 0 deletions data/R/test_run.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
#!/usr/bin/Rscript

zz <- file("~/repos/coR-ge/data/test_run.sink", open = "wt")

sink(zz)
sink(zz, type = "message")

if(!require(devtools)) install.packages("devtools")
devtools::install_github("Chris1221/coR-ge", ref = "devel")
library(coRge)

library(dplyr)
library(foreach)
library(magrittr)
library(data.table)

for(h2 in seq(0.1, 0.9, by = 0.1)){

coRge::analyze(i = 3, j = 3, h2 = h2)

}

sink(type = "message")
sink()

system("touch ~/repos/coR-ge/data/done")
10 changes: 10 additions & 0 deletions data/makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# Makefile for analysis
# Going from analysis script to output file
#
# Will have to change this probably

all: test_run.txt

test_run.txt: pipeline2.sh pipeline2.R
bash pipeline2.sh

Binary file added data/misc/MasterController.pdf
Binary file not shown.
Loading