
ImReP output

Igor edited this page Dec 25, 2016 · 3 revisions

The ImReP output is a tab-separated file consisting of the following fields:

  • CDR3 amino acid sequence (we refer to it as clonotype)

  • Read count supporting the clonotype

  • V gene

  • D gene

  • J gene
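The field list above can be read with a few lines of Python. This is a minimal sketch, not part of ImReP: it assumes the five fields appear in the order listed, and it skips any line whose second field is not an integer in case the file starts with a header. The function name `read_clonotypes` and the sample rows are made up for illustration.

```python
import csv
import io

# Field order as listed above; the dictionary keys are our own naming.
FIELDS = ["cdr3", "read_count", "v_gene", "d_gene", "j_gene"]

def read_clonotypes(handle):
    """Yield one dict per clonotype row from a tab-separated handle."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(FIELDS):
            continue  # skip blank or truncated lines
        record = dict(zip(FIELDS, row))
        try:
            record["read_count"] = int(record["read_count"])
        except ValueError:
            continue  # assumption: a non-numeric count means a header line
        yield record

# Made-up sample rows standing in for the contents of output.txt.
sample = (
    "CASSLGTDTQYF\t12\tTRBV7\tTRBD1\tTRBJ2\n"
    "CASSPGQGNEQFF\t3\tTRBV5\tNA\tTRBJ2\n"
)
clonotypes = list(read_clonotypes(io.StringIO(sample)))
total_reads = sum(c["read_count"] for c in clonotypes)
```

For a real run, replace `io.StringIO(sample)` with `open("output.txt", newline="")`.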

To get extended output, run ImReP with the --extendedOutput option:

$ python imrep.py unmappedReads.fastq output.txt --extendedOutput

Two additional files will be generated:

  • full_cdr3.txt - full CDR3 output

  • partial_cdr3.txt - partial CDR3 output, i.e. reads that overlap either a V or a J gene

Both files report information per read, so the user can trace each read to the CDR3 region it comes from. In more detail, both files have the following format:

  • Field 1: read name

  • Field 2: CDR3 sequence the read contributes to (full CDR3 in full_cdr3.txt, partial CDR3 in partial_cdr3.txt)

  • Field 3: V chain

  • Field 4: D chain

  • Field 5: J chain

  • Field 6: comma-separated list of V gene alleles the read overlaps. Each item in the list has the format "allele_name:overlap_aminoacids:mismatches_aminoacids"

  • Field 7: comma-separated list of J gene alleles the read overlaps. Each item in the list has the format "allele_name:overlap_aminoacids:mismatches_aminoacids"

  • Field 8: binary flag. Equals 1 if the list in Field 6 has exactly one item, 0 otherwise

  • Field 9: binary flag. Equals 1 if the list in Field 7 has exactly one item, 0 otherwise

  • Field 10: binary flag. Equals 1 if both the V and J alleles are determined uniquely, i.e. (Field 8) & (Field 9)
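The ten-field layout above can be parsed as in the following sketch. It is an illustration under assumptions rather than ImReP code: it assumes the fields are tab-separated like the main output, that the overlap and mismatch values are integers, and that list items not matching the three-part format can be skipped. The names `parse_alleles`, `parse_read_line` and `flags_consistent`, and the sample line, are hypothetical.

```python
def parse_alleles(field):
    """Split 'name:overlap:mismatches,...' into (name, overlap, mismatches) tuples."""
    alleles = []
    for item in field.split(","):
        parts = item.rsplit(":", 2)
        if len(parts) != 3:
            continue  # assumption: skip empty or placeholder items
        name, overlap, mismatches = parts
        alleles.append((name, int(overlap), int(mismatches)))
    return alleles

def parse_read_line(line):
    """Turn one line of full_cdr3.txt or partial_cdr3.txt into a dict."""
    f = line.rstrip("\n").split("\t")  # assumption: tab-separated fields
    return {
        "read": f[0],
        "cdr3": f[1],
        "v": f[2],
        "d": f[3],
        "j": f[4],
        "v_alleles": parse_alleles(f[5]),
        "j_alleles": parse_alleles(f[6]),
        "v_unique": int(f[7]),
        "j_unique": int(f[8]),
        "vj_unique": int(f[9]),
    }

def flags_consistent(rec):
    """Check Fields 8-10 against the allele lists, as described above."""
    return (
        rec["v_unique"] == int(len(rec["v_alleles"]) == 1)
        and rec["j_unique"] == int(len(rec["j_alleles"]) == 1)
        and rec["vj_unique"] == (rec["v_unique"] & rec["j_unique"])
    )

# Made-up example line: two candidate V alleles, one J allele.
line = ("read_1\tCASSLGTDTQYF\tTRBV7\tTRBD1\tTRBJ2\t"
        "TRBV7-2*01:10:0,TRBV7-3*01:10:1\tTRBJ2-3*01:8:0\t0\t1\t0\n")
rec = parse_read_line(line)
ok = flags_consistent(rec)
```

Keeping only records with `rec["vj_unique"] == 1` would restrict an analysis to reads whose V and J alleles are both determined uniquely.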
