ImReP output
ImReP output is a tab-separated file consisting of the following fields:
- CDR3 amino acid sequence (we refer to it as the clonotype)
- Read count supporting the clonotype
- V gene
- D gene
- J gene
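As an illustration of the field layout above, the following minimal sketch reads such a file into records. It assumes exactly five tab-separated columns in the order listed; the source does not say whether a header line is present, so rows whose read count is not a number are simply skipped. `Clonotype` and `parse_clonotypes` are hypothetical names, not part of ImReP, and the sample rows are made up.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Clonotype(NamedTuple):
    """One row of the main output (hypothetical container, not an ImReP type)."""
    cdr3: str        # CDR3 amino acid sequence (the clonotype)
    read_count: int  # number of reads supporting the clonotype
    v_gene: str
    d_gene: str
    j_gene: str


def parse_clonotypes(lines: Iterable[str]) -> List[Clonotype]:
    """Parse tab-separated lines into Clonotype records.

    Assumption: five columns per row, in the documented order.
    Rows with a different column count or a non-numeric read count
    (for example a header line, if one exists) are skipped.
    """
    records = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 5:
            continue
        cdr3, count, v_gene, d_gene, j_gene = row
        if not count.isdigit():
            continue
        records.append(Clonotype(cdr3, int(count), v_gene, d_gene, j_gene))
    return records


# Made-up sample rows, for illustration only.
sample = (
    "CASSLGQGAYEQYF\t12\tTRBV7-9\tTRBD1\tTRBJ2-7\n"
    "CAWSVGGNTEAFF\t3\tTRBV30\tNA\tTRBJ1-1\n"
)
clonotypes = parse_clonotypes(io.StringIO(sample))
```

To read a real file, the same function could be given an open file handle, e.g. `parse_clonotypes(open("output.txt", newline=""))`.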
To get extended output from ImReP, run the following command:
$ python imrep.py unmappedReads.fastq output.txt --extendedOutput
Two additional files will be generated:
- full_cdr3.txt: full CDR3 output
- partial_cdr3.txt: partial CDR3 output, i.e. reads that intersect either a V or a J gene
Both files provide per-read information, so for each read the user can trace the CDR3 region it comes from. In more detail, both files have the following format:
- Field 1: read name
- Field 2: CDR3 sequence the read takes part in (full CDR3 in full_cdr3.txt, partial CDR3 in partial_cdr3.txt)
- Field 3: V chain
- Field 4: D chain
- Field 5: J chain
- Field 6: comma-separated list of V gene alleles the read intersects with. Each item in the list has the format "allele_name:overlap_aminoacids:mismatches_aminoacids"
- Field 7: comma-separated list of J gene alleles the read intersects with. Each item in the list has the format "allele_name:overlap_aminoacids:mismatches_aminoacids"
- Field 8: binary flag. Equals 1 if the list in Field 6 has exactly one item, 0 otherwise
- Field 9: binary flag. Equals 1 if the list in Field 7 has exactly one item, 0 otherwise
- Field 10: binary flag. Equals 1 if both the V and J alleles are determined uniquely, i.e. (Field 8) & (Field 9), 0 otherwise
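The per-read format can be illustrated with a short parsing sketch. It assumes the extended files are tab-separated like the main output, that the overlap and mismatch values are integers, and that an allele list may be empty; none of this is confirmed by the description above. `AlleleHit`, `parse_allele_list`, `parse_read_line` and `flags_consistent` are hypothetical helper names, and the sample line is made up.

```python
from typing import List, NamedTuple


class AlleleHit(NamedTuple):
    """One item of Field 6 or Field 7 (hypothetical container)."""
    allele: str
    overlap_aa: int     # overlap, in amino acids
    mismatches_aa: int  # mismatches, in amino acids


def parse_allele_list(field: str) -> List[AlleleHit]:
    """Split "name:overlap:mismatches,..." into AlleleHit items."""
    hits = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty list (assumption)
        # Split from the right so only the last two colons are used.
        name, overlap, mismatches = item.rsplit(":", 2)
        hits.append(AlleleHit(name, int(overlap), int(mismatches)))
    return hits


def parse_read_line(line: str) -> dict:
    """Parse one line with the ten fields listed above."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 10:
        raise ValueError(f"expected 10 fields, got {len(cols)}")
    return {
        "read": cols[0],
        "cdr3": cols[1],
        "v": cols[2],
        "d": cols[3],
        "j": cols[4],
        "v_hits": parse_allele_list(cols[5]),
        "j_hits": parse_allele_list(cols[6]),
        "v_unique": int(cols[7]),
        "j_unique": int(cols[8]),
        "vj_unique": int(cols[9]),
    }


def flags_consistent(rec: dict) -> bool:
    """Check Fields 8-10 against the list lengths in Fields 6-7."""
    v = int(len(rec["v_hits"]) == 1)
    j = int(len(rec["j_hits"]) == 1)
    return (rec["v_unique"], rec["j_unique"], rec["vj_unique"]) == (v, j, v & j)


# Made-up sample line, for illustration only.
sample = (
    "read_1\tCASSLGQGAYEQYF\tTRBV7-9\tTRBD1\tTRBJ2-7\t"
    "TRBV7-9*01:5:0,TRBV7-9*03:5:1\tTRBJ2-7*01:6:0\t0\t1\t0"
)
record = parse_read_line(sample)
```

In this sample the read matches two V alleles and one J allele, so Field 8 is 0, Field 9 is 1, and Field 10 is 0.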