About kallisto files #14
Comments
Hi Woody,
Yes, you are right. Please use a Kallisto index built from the protein-coding transcripts in Gencode v19. More Kallisto support will be added in the future so that RBP TPM generation is more standardized. Sorry for the inconvenience; the conversions are currently hard-coded. For now, please build the Kallisto index using the Gencode FASTA sequences here: ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_19/gencode.v19.pc_transcripts.fa.gz
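For anyone hitting the same KeyError, one way to confirm an annotation mismatch is to compare the transcript IDs in kallisto's abundance.tsv against the headers of the FASTA used to build the index. The sketch below is only an illustration and not part of Darts_DNN: the function names are made up, it assumes the usual abundance.tsv layout with a target_id column, and it assumes IDs can be compared after dropping the version suffix and any |-separated fields. The toy lines (including the version numbers and header fields) stand in for real files.

```python
def base_transcript_id(raw_id):
    """Reduce a FASTA header or kallisto target_id to a bare transcript ID.

    Assumption: the ID is the first space- and '|'-delimited field, and the
    version suffix after '.' can be ignored when comparing annotations.
    """
    first_field = raw_id.strip().partition(" ")[0].partition("|")[0]
    return first_field.partition(".")[0]


def fasta_transcript_ids(fasta_lines):
    """Collect transcript IDs from the '>' header lines of a FASTA file."""
    return {
        base_transcript_id(line[1:])
        for line in fasta_lines
        if line.startswith(">")
    }


def abundance_transcript_ids(tsv_lines):
    """Collect transcript IDs from the target_id column of abundance.tsv."""
    rows = iter(tsv_lines)
    header = next(rows).rstrip("\n").split("\t")
    id_column = header.index("target_id")
    return {
        base_transcript_id(row.rstrip("\n").split("\t")[id_column])
        for row in rows
        if row.strip()
    }


def ids_missing_from_reference(tsv_lines, fasta_lines):
    """Transcripts quantified by kallisto but absent from the reference FASTA."""
    quantified = abundance_transcript_ids(tsv_lines)
    reference = fasta_transcript_ids(fasta_lines)
    return sorted(quantified - reference)


# Toy stand-ins for the real files (illustrative values only).
toy_fasta = [
    ">ENST00000335137.3|ENSG00000186092.4|toy_field|\n",
    "ACGTACGTACGT\n",
]
toy_abundance = [
    "target_id\tlength\teff_length\test_counts\ttpm\n",
    "ENST00000335137.3\t918\t719.0\t10\t1.5\n",
    "ENST00000631435.1\t1200\t1001.0\t4\t0.4\n",
]
print(ids_missing_from_reference(toy_abundance, toy_fasta))
```

With real data, open file handles can be passed in place of the toy lists. A non-empty result would suggest that the quantification was run against a different annotation than the one the hard-coded conversion expects.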
… On Nov 16, 2019, at 10:22 AM, Woody Lin ***@***.***> wrote:
Hi Dr. Zhang,
I am trying the following command to run DARTS:
Darts_DNN build_feature -i bayes_infer/A5SS.darts_bht.flat.txt -c ~/.darts/DNN/v0.1.0/trainedParam/A5SS-trainedParam-EncodeRoadmap.h5 -e Sample_WT_kallisto Sample_KD_kallisto -o A5SS_data.h5 --t A5SS
I got the following error message:
2019-11-16 10:14:12,982 - Darts_DNN.build_feature - INFO - convert tx to gene TPM
Traceback (most recent call last):
...skip...
KeyError: 'ENST00000631435'
Does this mean that I am using the wrong files (or wrong version of gene annotation) from kallisto?
Files in the kallisto folder (based on Ensembl v96):
abundance.h5 abundance.tsv run_info.json
Thanks,
Woody
Hi Dr. Zhang,
Thanks for your quick reply. I proceeded with Gencode v19, installed the Python module "tables", and got the following error:
File "/anaconda3/envs/darts/lib/python2.7/site-packages/Darts_DNN-0.1.0-py2.7.egg/EGG-INFO/scripts/Darts_DNN", line 49, in main
File "/anaconda3/envs/darts/lib/python2.7/site-packages/Darts_DNN-0.1.0-py2.7.egg/Darts_DNN/Darts_build_feature.py", line 157, in parser
Thanks, |
@wososa Please use |
@zj-zhang Thanks for your reply. |
@wososa Not necessarily, actually. For example, you can run
Darts_DNN predict -i darts_flat/Sp_out.txt \
-o darts_pred.txt \
-e kallisto/Day5_rep1/,kallisto/Day5_rep2/,kallisto/Day5_rep3/ kallisto/No_Dox_rep1/,kallisto/No_Dox_rep2/,kallisto/No_Dox_rep3/
This is illustrated in the help message, shown by running:
$ Darts_DNN predict -h
usage: Darts_DNN predict [-h] -i INPUT -o OUTPUT [-t {SE,A5SS,A3SS,RI}]
[-e EXPR [EXPR ...]] [-m MODEL]
optional arguments:
-h, --help show this help message and exit
-i INPUT Input feature file (*.h5) or Darts_BHT output (*.txt)
-o OUTPUT Output filename
-t {SE,A5SS,A3SS,RI} Optional, default SE: specify the alternative splicing
event type. SE: skipped exons, A3SS: alternative 3
splice sites, A5SS: alternative 5 splice sites, RI:
retained introns
-e EXPR [EXPR ...] Optional, required if input is Darts_BHT output;
Folder path for Kallisto expression files; e.g '-e
Ctrl_rep1,Ctrl_rep2 KD_rep1,KD_rep2'
-m MODEL Optional, default using current version model in user
home directory: Filepath for a specific model
parameter file
Hope this helps.
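To make the -e layout from the help message concrete: replicates of one condition are joined with commas, and the conditions are separated by spaces. Below is a small sketch that assembles the argument list this way. It uses only the flags shown in the help output above; the function names are hypothetical and nothing here is part of Darts_DNN itself.

```python
def expression_args(conditions):
    """Build the -e arguments: one comma-joined token per condition.

    `conditions` is a list of lists, each inner list holding the kallisto
    output folders for the replicates of one condition.
    """
    if any(len(replicates) == 0 for replicates in conditions):
        raise ValueError("each condition needs at least one kallisto folder")
    return ["-e"] + [",".join(replicates) for replicates in conditions]


def predict_command(input_path, output_path, conditions, event_type="SE"):
    """Assemble a `Darts_DNN predict` argument list (hypothetical helper)."""
    base = ["Darts_DNN", "predict", "-i", input_path, "-o", output_path,
            "-t", event_type]
    return base + expression_args(conditions)


command = predict_command(
    "darts_flat/Sp_out.txt",
    "darts_pred.txt",
    [
        ["kallisto/Day5_rep1/", "kallisto/Day5_rep2/", "kallisto/Day5_rep3/"],
        ["kallisto/No_Dox_rep1/", "kallisto/No_Dox_rep2/", "kallisto/No_Dox_rep3/"],
    ],
)
# Print the command as it would be typed in a shell.
print(" ".join(command))
```

Keeping the arguments as a list rather than one string avoids accidental splitting on spaces if the list is later handed to something like subprocess.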
In fact, in case it might be potentially useful for others, let me add that using |
I can understand now. Thanks! |
@zj-zhang My |
@wososa Most likely it's because the majority of the A5SS events in your file do not have pre-compiled cis-sequence features. Could you check the ID overlap between |
@zj-zhang Thanks for your quick reply. If the number of overlapping events is small, does it mean that my A5SS events are new to the Gencode annotation? I probably can't process the large number of RNA-seq datasets in DARTS-DNN to regenerate the features. |
Yes, if the number of overlapping events is small, that means the A5SS events are likely novel events specific to your RNA-seq data. The sequence features were compiled by @zcpan; if that's indeed the case, I will open a new issue so we can keep better track of it. |
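The overlap check discussed above can be sketched as follows. This is only an illustration under assumptions: the thread does not show which two files should be compared, so the sketch works on two plain collections of event IDs, and the reader helper assumes a tab-delimited file with one header line and the event ID in the first column. Both function names and the toy IDs are hypothetical.

```python
def first_column_ids(lines, has_header=True):
    """Read event IDs from the first column of tab-delimited lines.

    Assumption: one optional header line, event ID in the first column.
    """
    rows = iter(lines)
    if has_header:
        next(rows, None)  # skip the header line if present
    return {row.split("\t", 1)[0].strip() for row in rows if row.strip()}


def overlap_summary(query_ids, feature_ids):
    """Summarize how many query events have pre-compiled features."""
    query_ids, feature_ids = set(query_ids), set(feature_ids)
    shared = query_ids & feature_ids
    fraction = len(shared) / float(len(query_ids)) if query_ids else 0.0
    return {
        "query": len(query_ids),
        "shared": len(shared),
        "fraction": fraction,
        "novel": sorted(query_ids - feature_ids),
    }


# Toy stand-ins for the user's events and the pre-compiled feature IDs.
toy_events = [
    "ID\tvalue\n",
    "event_A\t10\n",
    "event_B\t5\n",
    "event_C\t0\n",
]
toy_features = {"event_A", "event_X"}

summary = overlap_summary(first_column_ids(toy_events), toy_features)
print(summary)
```

A low "fraction" would be consistent with the events being novel relative to the pre-compiled features, and the "novel" list shows which ones are affected.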
I am not too sure what went wrong, but it appears that Darts_DNN is not recognizing the input directory supplied with the -e parameter:
constructing in-memory feature matrix
File "/anaconda3/envs/darts/lib/python2.7/site-packages/Darts_DNN-0.1.0-py2.7.egg/EGG-INFO/scripts/Darts_DNN", line 44, in main
File "/envs/darts/lib/python2.7/site-packages/Darts_DNN-0.1.0-py2.7.egg/Darts_DNN/Darts_pred.py", line 103, in parser
Any suggestions on how to sort this out?
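One thing that may be worth ruling out first is a path problem. Below is a minimal sketch, assuming the -e values follow the comma/space layout from the help message and that each kallisto folder should contain abundance.tsv. The thread lists abundance.h5, abundance.tsv and run_info.json as kallisto outputs; which of these Darts_DNN actually reads is an assumption here, and the function name is hypothetical.

```python
import os
import tempfile


def check_expression_folders(expr_values, required_file="abundance.tsv"):
    """Return (folder, problem) pairs for folders that look unusable.

    `expr_values` holds the tokens given after -e, one per condition, each a
    comma-separated list of kallisto output folders.
    """
    problems = []
    for value in expr_values:
        for folder in value.split(","):
            if not folder:
                problems.append((folder, "empty path (stray comma?)"))
            elif not os.path.isdir(folder):
                problems.append((folder, "not a directory"))
            elif not os.path.isfile(os.path.join(folder, required_file)):
                problems.append((folder, "missing " + required_file))
    return problems


# Toy layout in a temporary directory: one good folder, one empty, one absent.
workdir = tempfile.mkdtemp()
good = os.path.join(workdir, "rep1")
empty = os.path.join(workdir, "rep2")
absent = os.path.join(workdir, "rep3")
os.mkdir(good)
os.mkdir(empty)
open(os.path.join(good, "abundance.tsv"), "w").close()

report = check_expression_folders([good + "," + empty, absent])
for folder, problem in report:
    print(folder, "->", problem)
```

An empty report only means the folders exist and contain the expected file; it does not rule out other causes of the error.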