Problem with collate function #9
Comments
Could I ask you to inspect the contents of the two files?
I wonder if one of the files is missing a result.
Hi! I inspected the files and they seem valid. Each sample has one line in duplicate_metrics and 3 lines in alignment_metrics. No empty lines, no duplicates... I even realigned my samples in case something went wrong there, but still get the same error. Anything else I could check?
If you could share the output files, I might be able to fix the code to work with your files.
I was confused because you added You cannot open them with Excel: You can read the contents anyway:
I tried running the R code in the picardmetrics script (lines 929 to 932 in 94cb651).
It looks like your files are ok:

setwd("~/Downloads/8Fat")
read_tsv <- function(filename, ...) {
  if (!file.exists(filename)) {
    warning("File does not exist: ", filename)
    return(NULL)
  }
  dat <- read.delim(filename, stringsAsFactors = FALSE, ...)
  return(dat)
}
dat_align_metrics <- read_tsv("8Fat-alignment-metrics.tsv")
dat_duplicate_metrics <- read_tsv("8Fat-duplicate-metrics.tsv")
idx = !dat_align_metrics$CATEGORY %in% c("FIRST_OF_PAIR", "SECOND_OF_PAIR")
dat_align_metrics = dat_align_metrics[idx, ]
all(dat_align_metrics$SAMPLE == dat_duplicate_metrics$SAMPLE)
#> [1] TRUE

Created on 2018-10-11 by the reprex package (v0.2.0).

Here are my suggestions:
Good luck!
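If the check above returns FALSE on another set of files, a sketch along these lines might help narrow down which samples disagree. It assumes both tables have a SAMPLE column and the alignment table has a CATEGORY column, as in the snippet above. The helper name find_sample_mismatches and the toy data frames are hypothetical and not part of picardmetrics.

```r
# Hypothetical helper, not part of picardmetrics: report where the SAMPLE
# columns of the two collated tables disagree.
find_sample_mismatches <- function(align, dup) {
  # Drop the per-read rows, mirroring the filter used in the snippet above.
  keep <- !align$CATEGORY %in% c("FIRST_OF_PAIR", "SECOND_OF_PAIR")
  align <- align[keep, , drop = FALSE]
  list(
    n_align       = nrow(align),
    n_dup         = nrow(dup),
    only_in_align = setdiff(align$SAMPLE, dup$SAMPLE),
    only_in_dup   = setdiff(dup$SAMPLE, align$SAMPLE),
    # TRUE only if both tables list the same samples in the same order.
    same_order    = identical(as.character(align$SAMPLE),
                              as.character(dup$SAMPLE))
  )
}

# Toy data for illustration only (made-up sample names).
align <- data.frame(
  SAMPLE = rep(c("s1", "s2"), each = 3),
  CATEGORY = rep(c("FIRST_OF_PAIR", "SECOND_OF_PAIR", "PAIR"), times = 2),
  stringsAsFactors = FALSE
)
dup <- data.frame(SAMPLE = c("s2", "s1"), stringsAsFactors = FALSE)

res <- find_sample_mismatches(align, dup)
print(res)
```

In the toy data both tables contain the same samples but in a different order, which is one way the all(...) comparison could fail even when no sample is missing.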
Ok, I'll try! Thanks!
Hi! Thanks for this wonderful application, it helped me a lot.
However, I ran into an issue with the collate function that I can't figure out how to solve:
picardmetrics collate filename /path/filename/
picardmetrics version 0.2.4 2016-07-06
2018-09-17 10:47:47 START filename
2018-09-17 10:47:47 Collating 96 alignment_summary_metrics files
2018-09-17 10:47:53 Collating 96 quality_distribution_metrics files
2018-09-17 10:47:57 Collating 96 rnaseq_metrics files (summary)
2018-09-17 10:48:01 Collating 96 rnaseq_metrics files (coverage)
2018-09-17 10:48:05 Collating 96 gc_bias_metrics files
2018-09-17 10:48:07 Collating 96 gc_bias_histogram files
2018-09-17 10:48:12 Collating 96 duplicate_metrics files
2018-09-17 10:48:16 Collating 96 insert_size_metrics files
2018-09-17 10:48:21 Collating 96 insert_size_metrics files (histogram)
2018-09-17 10:48:25 Collating 96 base_distribution_by_cycle files
2018-09-17 10:48:32 Collating 96 library_complexity files
2018-09-17 10:48:36 Collating 96 library_complexity files (histogram)
2018-09-17 10:48:40 Collating 96 mapq_stats files
2018-09-17 10:48:41 Joining all files into 'filename-all-metrics.tsv'
Error: all(dat_align_metrics$SAMPLE == dat_duplicate_metrics$SAMPLE) is not TRUE
Execution halted
2018-09-17 10:48:42 DONE filename
I don't understand where the problem lies, as there are 96 files in each category. When I count the lines in the intermediate files, I get 97 for duplicate_metrics and 289 for alignment_metrics.
Thanks
Alex
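As a quick sanity check on those line counts, the arithmetic below shows that 97 and 289 are what one would expect from 96 samples, assuming each intermediate file starts with a single header line (an assumption, not something confirmed in this thread) and that alignment_metrics holds three rows per sample.

```r
n_samples    <- 96
header_lines <- 1  # assumption: one header line per intermediate file

# One row per sample in duplicate_metrics.
dup_expected   <- n_samples * 1 + header_lines
# Three rows per sample in alignment_metrics.
align_expected <- n_samples * 3 + header_lines

print(c(duplicate = dup_expected, alignment = align_expected))
```

Under that assumption the counts alone do not point to a missing or duplicated sample, so the sample names and their order are the next thing to compare.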