Commit 0538ad2

Update migec_summary.Rmd
Line 169: We wrapped the factor levels in `unique()`, because the ordered sample vector contained duplicates and this caused an error.
Line 209: We added a derived `READS_DROPPED_WITHIN_MIG` column, because assemble.log.txt does not contain a column named 'READS DROPPED WITHIN MIG'. It contains separate MIG_1 and MIG_2 columns for fastq1 and fastq2, which we sum to suit the function.
Plot chunks: We set `warning=FALSE` to hide the warning "`guides(<scale> = FALSE)` is deprecated. Please use `guides(<scale> = "none")` instead." We DID NOT update the plotting functions; suppressing the warning is a temporary workaround.
1 parent afe59a8 commit 0538ad2
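
For reference, the change that the commit message defers (updating the plotting calls instead of hiding the warning) might look roughly like the sketch below. It assumes ggplot2 is installed and that the legend being hidden belongs to the `fill` scale; this diff does not show which scale the MIGEC plotting functions actually pass to `guides()`, and the data frame `dd` is made up for illustration.

```r
library(ggplot2)

# Hypothetical per-sample values, not taken from any MIGEC log
dd <- data.frame(sample = c("S1", "S2"), value = c(120, 80))

p <- ggplot(dd, aes(x = sample, y = value, fill = sample)) +
  geom_bar(stat = "identity") +
  guides(fill = "none")  # replaces the deprecated guides(fill = FALSE)
```

With the calls updated this way, `warning=FALSE` would no longer be needed on the chunks to silence this particular message.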

1 file changed: +12 -11 lines changed

util/migec_summary.Rmd

Lines changed: 12 additions & 11 deletions
@@ -166,7 +166,7 @@ colnames(df) <- c("sample", "sample.type", "threshold", "peak", "mig.size", "cou
 # summarize by sample type, normalize within sample
 df <- aggregate(count ~ sample + mig.size + threshold + peak, data=df, FUN=sum)
 df.n <- ddply(df,.(sample),transform,count=count/sum(count))
-df.n$sample <-factor(df.n$sample, levels=df.n[order(df.n$peak), "sample"])
+df.n$sample <-factor(df.n$sample, levels=unique(df.n[order(df.n$peak), "sample"]))
 
 # plotting
 
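
The effect of the `unique()` change in this hunk can be reproduced on a toy data frame. This is a minimal sketch: the sample names and peak values are invented, and it assumes a recent R release in which `factor()` rejects duplicated levels with an error.

```r
# Toy data: two rows per sample, as happens when a sample has several mig.size rows
df.n <- data.frame(
  sample = c("S1", "S1", "S2", "S2"),
  peak   = c(8, 8, 4, 4),
  stringsAsFactors = FALSE
)

# Sample names ordered by peak; every sample appears once per row
ordered.samples <- df.n[order(df.n$peak), "sample"]

# Passing the repeated names as levels fails in recent R versions
dup.result <- tryCatch(
  factor(df.n$sample, levels = ordered.samples),
  error = function(e) "error"
)

# unique() keeps the first occurrence of each name, so the peak order survives
fixed <- factor(df.n$sample, levels = unique(ordered.samples))
```

Because `unique()` preserves first-occurrence order, the levels still follow increasing `peak`, which is what the plot ordering relies on.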
@@ -206,6 +206,7 @@ require(scales)
 
 if (!is.null(assemble_path)) {
 df <- read.table(paste(assemble_path, "/assemble.log.txt", sep = "/"), header=T, comment ="")
+df$READS_DROPPED_WITHIN_MIG = df$READS_DROPPED_WITHIN_MIG_1 + df$READS_DROPPED_WITHIN_MIG_2
 df <- data.frame(sample <- df$X.SAMPLE_ID,
 migs.assembled <- df$MIGS_GOOD_TOTAL,
 umi.fraction.assembled <- df$MIGS_GOOD_TOTAL / df$MIGS_TOTAL,
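
The added line in this hunk derives one combined column from the two per-read columns. A guarded variant is sketched below on an invented two-sample table that uses the column names from the diff. The guard is an assumption, not part of the commit: it only matters if some log files already provide a combined `READS_DROPPED_WITHIN_MIG` column.

```r
# Hypothetical log contents; the values are made up
log.df <- data.frame(
  X.SAMPLE_ID = c("S1", "S2"),
  READS_DROPPED_WITHIN_MIG_1 = c(10, 3),
  READS_DROPPED_WITHIN_MIG_2 = c(5, 7)
)

# Add the combined column only when the log does not already have it
if (!"READS_DROPPED_WITHIN_MIG" %in% colnames(log.df)) {
  log.df$READS_DROPPED_WITHIN_MIG <-
    log.df$READS_DROPPED_WITHIN_MIG_1 + log.df$READS_DROPPED_WITHIN_MIG_2
}
```

Checking `colnames()` explicitly avoids relying on `$` partial matching, which returns `NULL` when a prefix matches more than one column.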
@@ -242,7 +243,7 @@ plotAsm.2 <- function(dd) {
 
 Below is a plot showing the total number of assembled MIGs per sample. The number of MIGs should be interpreted as the total number of starting molecules that have been successfully recovered.
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p<-"Nothing to plot"
 if (!is.null(assemble_path)) {
 df.1 <- subset(df, variable == "migs.assembled")
@@ -357,7 +358,7 @@ plotCdr.2 <- function(dd) {
 
 The plot below shows the total number of MIGs that contain good-quality CDR3 region in the consensus sequence
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrblast_path)) {
 df.s <- subset(df, variable == "final.count" & type == "asm" & metric == "mig")
@@ -370,7 +371,7 @@ p
 
 Total number of reads that contain good-quality CDR3 region in raw reads
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrblast_path)) {
 df.s <- subset(df, variable == "final.count" & type == "asm" & metric == "read")
@@ -385,7 +386,7 @@ Mapping rate, the fraction of reads/MIGs that contain a CDR3 region
 
 > Panels show assembled (**asm**) and unprocessed (**raw**) data. Values are given in number of molecules (**mig**, assembled samples only) and the corresponding read count (**read**)
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrblast_path)) {
 df.s <- subset(df, variable == "map.rate")
@@ -400,7 +401,7 @@ Good-quality CDR3 sequence rate, the fraction of CDR3-containing reads/MIGs that
 
 > Note that while raw data is being filtered based on Phred quality score, consensus quality score (CQS, the ratio of major variant) is used for assembled data
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrblast_path)) {
 df.s <- subset(df, variable == "qual.rate")
@@ -470,7 +471,7 @@ plotCdrFinal.2 <- function(dd) {
 
 Below is the plot of sample diversity, i.e. the number of clonotypes in a given sample
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrfinal_path)) {
 df.1 <- data.frame(sample = df$sample, value = df$clones.count)
@@ -483,7 +484,7 @@ p
 
 Total number of molecules (MIGs) in final clonotype tables
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrfinal_path)) {
 df.1 <- data.frame(sample = df$sample, value = df$migs.count)
@@ -498,7 +499,7 @@ Rate of hot-spot and singleton error filtering, in terms of clonotypes (**clone*
 
 > As clonotypes represented by a single MIG (singletons) have insufficient info to apply MiGEC-style error filtering, a simple frequency-based filtering is used for them.
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrfinal_path)) {
 df.2 <- data.frame(sample = df$sample, mig = df$migs.filter.rate, clone = df$clones.filter.rate)
@@ -512,12 +513,12 @@ p
 
 Rate of non-coding CDR3 sequences, in terms of clonotypes (**clone** panel) and MIGs (**mig** panel)
 
-```{r, echo=FALSE, message=FALSE}
+```{r, echo=FALSE, message=FALSE, warning=FALSE}
 p <- "Nothing to plot"
 if (!is.null(cdrfinal_path)) {
 df.2 <- data.frame(sample = df$sample, mig = df$migs.nc.rate, clone = df$clones.nc.rate)
 df.2 <- melt(df.2)
 p<-plotCdrFinal.2(df.2)
 }
 p
-```
+```
