CITE-seq-Count 100% unmapped #161

Open
carbycrab opened this issue Oct 24, 2021 · 9 comments

@carbycrab

Hello,

Thank you for developing such an innovative package! I've been trying to run CITE-seq-Count on my 10X V3 data, but I keep getting 100% unmapped reads.

This is the command I used:

CITE-seq-Count \
  -R1 d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R1_001.fastq.gz \
  -R2 d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L003_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L004_R2_001.fastq.gz \
  -t tags.csv -cbf 1 -cbl 16 -umif 17 -umil 28 -cells 5000 \
  -o cite_seq_results/ --start-trim 10

The command is based on the output of grepping for our antibody tags:

[Screenshot: Screen Shot 2021-10-24 at 1 26 03 PM]

I have also tried --start-trim 0 and 1, as well as --sliding-window, and they all return 100% unmapped. I've attached the run_report.yaml below.

Date: 2021-10-19
Running time: 7.0 hours, 19.0 minutes, 16.43 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3266
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 217758
        UMI collapsing threshold: 2
        UMIs corrected: 8244869
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fa$
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fa$
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 1

Do you have any idea what I can do to fix this? Thanks so much for your help!
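
For reference, the position flags in the command above (-cbf 1 -cbl 16 -umif 17 -umil 28) describe a 16-base cell barcode followed by a 12-base UMI on Read 1, which is what the 1-based, inclusive reading of those values implies. A minimal Python sketch of that layout, where `slice_positions` and `split_read1` are illustrative names and not part of CITE-seq-Count:

```python
def slice_positions(read, first, last):
    """Extract a 1-based, inclusive [first, last] span from a read."""
    return read[first - 1:last]


def split_read1(read1, cbf=1, cbl=16, umif=17, umil=28):
    """Split Read 1 into (cell barcode, UMI) using 1-based inclusive positions.

    Defaults mirror the values used in this thread; they are assumptions
    about the library layout, not values read from the data.
    """
    barcode = slice_positions(read1, cbf, cbl)
    umi = slice_positions(read1, umif, umil)
    return barcode, umi


# Made-up read: 16 A's (barcode), 12 C's (UMI), then trailing bases.
demo_barcode, demo_umi = split_read1("A" * 16 + "C" * 12 + "TTTTT")
print(len(demo_barcode), len(demo_umi))
```

If either length printed for a real read differs from 16 and 12, the position flags do not match the Read 1 layout.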

@fjrossello

fjrossello commented Oct 26, 2021

Hi,

Judging by the attached screenshot, --start-trim should be 10 (i.e. --start-trim 10), since your tag sequence starts at the 11th base.
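
The offset can also be checked against the reads themselves rather than read off a screenshot. The following is a minimal Python sketch, assuming Read 2 is a standard 4-line FASTQ file and that the tag sequences are copied by hand from tags.csv. The names `read_sequences` and `tag_offsets` are illustrative and not part of CITE-seq-Count, and the demo sequences are made up. It tallies the 0-based position at which a tag first appears exactly in each read.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(fastq_path, max_reads=None):
    """Yield the sequence line of each record in a FASTQ file (gzipped or plain)."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        # FASTQ records are 4 lines each; the sequence is the 2nd line.
        seq_lines = islice(handle, 1, None, 4)
        if max_reads is not None:
            seq_lines = islice(seq_lines, max_reads)
        for line in seq_lines:
            yield line.strip()


def tag_offsets(sequences, tags):
    """Count the 0-based offset of the earliest exact tag match in each read."""
    offsets = Counter()
    for seq in sequences:
        hits = [seq.find(tag) for tag in tags]
        hits = [h for h in hits if h >= 0]  # find() returns -1 when absent
        if hits:
            offsets[min(hits)] += 1
    return offsets


# Made-up reads and tag, for illustration only.
demo_reads = ["TTTTTTTTTTGATTACAGATTACATTT", "GGGGGGGGGGGG"]
print(tag_offsets(demo_reads, ["GATTACAGATTACA"]))
```

If most hits land at offset 10, the tag starts at the 11th base and --start-trim 10 is consistent with the data. A histogram spread over several offsets would instead point toward a variable-length prefix.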

@cpflueger2016

(1) Do you mind posting your tags.csv file?

(2) Also, did you try running CITE-seq-Count on just a single fastq file (e.g. L001 only), or on a merged fastq file instead of a comma-separated list? Just trying to eliminate potential issues.

(3) Ideally, if you have a whitelist from cellranger, it really helps to run CITE-seq-Count with it so that it can assign the antibody barcodes/tags to the relevant cells.

@carbycrab carbycrab reopened this Nov 1, 2021
@carbycrab
Author

Thank you for your reply.

  1. I've attached tags.csv here: tags.csv

  2. I re-ran the command on merged fastq files, this time including --whitelist 3M-february-2018.txt and --start-trim 10, and still got 100% unmapped.
    run_report.yaml:

Date: 2021-10-31
Running time: 19.0 hours, 42.0 minutes, 7.002 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 259436768
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 3246
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 282864
        UMI collapsing threshold: 2
        UMIs corrected: 8391407
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R1.fastq.gz
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_R2.fastq.gz
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 10

I also tried on just one file. Still 100% unmapped.
run_report.yaml:

Date: 2021-10-27
Running time: 1.0 hour, 28.0 minutes, 24.34 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 65028569
Percentage mapped: 0
Percentage unmapped: 100
Uncorrected cells: 584
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 41861
        UMI collapsing threshold: 2
        UMIs corrected: 3376189
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 1000
        Tags max errors: 2
        Start trim: 10

Do you know what could be going on? Thanks so much!

@Hoohm
Owner

Hoohm commented Nov 3, 2021

The fact that you still get 100% unmapped after the trim change is odd. It seems like that should have fixed it.

Try a run without the whitelist, on only the first 10000 reads.
The trim is 10 for sure.
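
One way to prepare such a small test run is to cut the first 10000 records out of each FASTQ file before calling CITE-seq-Count. The following is a minimal Python sketch, assuming gzipped 4-line FASTQ input. The function name `write_first_n_reads` and the output file names in the usage comment are hypothetical.

```python
import gzip
from itertools import islice


def write_first_n_reads(src_path, dst_path, n_reads=10000):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Both files are gzip-compressed. Use the same n_reads for R1 and R2
    so that the read pairs stay in sync.
    """
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        dst.writelines(islice(src, 4 * n_reads))


# Usage (output names are hypothetical):
# write_first_n_reads("d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz", "subset_R1.fastq.gz")
# write_first_n_reads("d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz", "subset_R2.fastq.gz")
```

A subset this small runs in seconds, which makes it practical to try several --start-trim values in a row.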

@carbycrab
Author

I tried running the command without --whitelist on just the first 10000 reads, and this is what I get:
run_report.yaml:

Date: 2021-11-03
Running time: 5.645 seconds
CITE-seq-Count Version: 1.4.3
Reads processed: 10000
Percentage mapped: 1
Percentage unmapped: 249
Uncorrected cells: 0
Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 341
        UMI collapsing threshold: 2
        UMIs corrected: 29
Run parameters:
        Read1_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R1_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R1_001.fastq$
        Read2_paths: d3repI_fastq/D3-C-CV-Rep1-FB_S4_L001_R2_001.fastq.gz,d3repI_fastq/D3-C-CV-Rep1-FB_S4_L002_R2_001.fastq$
        Cell barcode:
                First position: 1
                Last position: 16
        UMI barcode:
                First position: 17
                Last position: 28
        Expected cells: 5000
        Tags max errors: 2
        Start trim: 10

@Hoohm
Owner

Hoohm commented Nov 21, 2021

I think the trim is at 9 actually, not 10

@carbycrab
Author

carbycrab commented Nov 23, 2021

@Hoohm Thanks for that suggestion. I tried with merged R1/R2 files and with separate R1/R2 files, both with --start-trim 9, but still got 100% unmapped.

@sunta3iouxos

@Hoohm, may I ask why the trim would be 9? There are 10 initial nucleotides.
@carbycrab, could it be that the fastq file contains ADTs in addition to the CMOs (CMOs for multiplexing, ADTs for cell-surface identity)? The percentage of the latter is usually very high.
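
A quick way to test this idea is to classify a sample of Read 2 sequences against both tag panels at once. The sketch below uses a plain Hamming-distance comparison at a fixed offset. It is a simplified stand-in and not necessarily how CITE-seq-Count matches tags. The defaults `start_trim=10` and `max_errors=2` mirror the values shown in the run reports, and the tag names and sequences in the demo are made up.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_tag(read, tags, start_trim=10, max_errors=2):
    """Return the name of the closest tag within max_errors mismatches, or None."""
    best_name, best_dist = None, max_errors + 1
    for name, tag_seq in tags.items():
        window = read[start_trim:start_trim + len(tag_seq)]
        if len(window) < len(tag_seq):
            continue  # read too short to hold this tag at the offset
        dist = hamming(window, tag_seq)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


def classify_reads(reads, tags, start_trim=10, max_errors=2):
    """Tally reads per best-matching tag; reads with no match count as 'unmapped'."""
    return Counter(
        best_tag(read, tags, start_trim, max_errors) or "unmapped" for read in reads
    )


# Made-up tags and reads, for illustration only.
demo_tags = {"CMO_demo": "ACGTACGTACGTACG", "ADT_demo": "TTTTGGGGCCCCAAA"}
demo_reads = ["N" * 10 + "ACGTACGTACGTACC" + "AAAA", "N" * 10 + "G" * 15]
print(classify_reads(demo_reads, demo_tags))
```

If most reads fall under tags from a panel that is missing from tags.csv, that would be consistent with a file dominated by a different set of tags.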
