Empty output from CreateNetwork #260

Closed
swethas112 opened this issue Apr 4, 2024 · 3 comments

@swethas112

Hello,

Thanks for the great tool.
I was hoping to identify networks with the CreateNetwork module using the command:
TOBIAS CreateNetwork --TFBS annotated/*/beds/*Knockout_bound.bed --origin motif2gene_mapping.txt
However, I am not getting any networks in the output files. Could you help me figure out whether I have missed something? I am attaching one of the bed files and the origin file for reference.

Thanks,
Swetha

motif2gene_mapping.txt

TFBS:

chr1	11274014	11274021	GATA1	8.20623	+	chr1	11272967	11274172	Knockout,Wildtype	.	.	start_codon	11273531	11273534	+	end	35	FeatureInsidePeak	0.002	1.0	NA	CCDS81260.1	ENSE00001471726.1	1	ENSG00000120942.14	UBIAD1	protein_coding	OTTHUMG00000002075.2	OTTHUMT00000005775.2	HGNC:30791	2	NA	ENSP00000366000.1	basic,CCDS	ENST00000376804.2	UBIAD1-201	2	protein_coding	query_1	1.30158
chr1	11671620	11671627	GATA1	8.20623	-	chr1	11671222	11672148	Knockout,Wildtype	.	.	CDS	11671927	11672023	+	start	242	FeatureInsidePeak	0.104	1.0	NA	CCDS133.1	ENSE00000818922.1	4	ENSG00000116663.11	FBXO6	protein_coding	OTTHUMG00000002229.2	OTTHUMT00000006332.2	HGNC:13585	2	NA	ENSP00000365944.4	basic,Ensembl_canonical,MANE_Select,appris_principal_1,CCDS	ENST00000376753.9	FBXO6-201	1	protein_coding	query_1	2.79763
chr1	11842441	11842448	GATA1	8.20623	-	chr1	11842018	11843006	Knockout,Wildtype	.	.	transcript	11806190	11843130	+	end	618	PeakInsideFeature	1.0	0.027	NA	CCDS138.1	NA	NA	ENSG00000011021.23	CLCN6	protein_coding	OTTHUMG00000002299.8	OTTHUMT00000006639.3	HGNC:2024	2	NA	ENSP00000234488.9	basic,Ensembl_canonical,MANE_Select,appris_principal_1,CCDS	ENST00000346436.11	CLCN6-202	1	protein_coding	query_1	6.98205
chr1	16206462	16206469	GATA1	9.08368	+	chr1	16206170	16207303	Knockout,Wildtype	.	.	CDS	16206947	16207210	-	end	211	FeatureInsidePeak	0.232	1.0	NA	CCDS170.1	ENSE00000955436.1	6	ENSG00000142632.17	ARHGEF19	protein_coding	OTTHUMG00000002219.5	OTTHUMT00000006289.2	HGNC:26604	2	NA	ENSP00000270747.3	basic,Ensembl_canonical,MANE_Select,appris_principal_1,CCDS	ENST00000270747.8	ARHGEF19-201	1	protein_coding	query_1	1.53035
chr1	16644124	16644131	GATA1	8.20623	+	chr1	16643710	16645115	Knockout,Wildtype	.	.	exon	16644645	16644683	-	end	233	FeatureInsidePeak	0.027	1.0	NA	NA	ENSE00001411231.2	ENSG00000291072.1	ENSG00000291072	lncRNA	NA	OTTHUMT00000092783.2	NA	2	NA	NA	basic	ENST00000362058.2	ENST00000362058	2	lncRNA	query_1	13.18856

@mohobein
Collaborator

mohobein commented Apr 9, 2024

Hey Swetha,

For CreateNetwork to work, the bed file needs to contain the target gene for each TFBS, and the ID of that target gene also needs to be present in your --origin file so that the two can be linked. If you look at your bed file, all of your Ensembl IDs have a version number attached (ENSG00000116663.11 -> ENSG00000116663, version 11). In your motif2gene_mapping.txt file, the gene IDs do not carry version numbers. I suspect that this is why you do not get any networks.

Perhaps your problem can be solved by removing the version numbers from all gene IDs in your bed file, or by using a mapping file that also includes version numbers. The IDs have to match for networks to be identified.
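A minimal sketch of the first option (removing the version numbers), offered as one possible approach and not as part of TOBIAS itself. It assumes human Ensembl gene IDs of the form `ENSG` plus 11 digits; the function names and the file paths passed to `strip_file` are hypothetical.

```python
import re

# Assumption: human Ensembl gene IDs look like ENSG + 11 digits, optionally
# followed by ".<version>" (e.g. ENSG00000116663.11). Transcript, exon and
# protein IDs (ENST/ENSE/ENSP) are deliberately left untouched.
VERSIONED_GENE_ID = re.compile(r"\b(ENSG\d{11})\.\d+\b")


def strip_gene_versions(line: str) -> str:
    """Return the line with version suffixes removed from Ensembl gene IDs."""
    return VERSIONED_GENE_ID.sub(r"\1", line)


def strip_file(in_path: str, out_path: str) -> None:
    """Write a copy of a bed file with unversioned gene IDs (hypothetical helper)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(strip_gene_versions(line))
```

Applied to the second row of the bed file above, this would turn `ENSG00000116663.11` into `ENSG00000116663` while leaving `ENST00000376753.9` as it is. Writing to a new file keeps the original annotation intact in case the versioned IDs are needed elsewhere.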

Also make sure that both files correspond to the same organism. In your case, both the bed file and the mapping file contain human gene IDs, so this should work.
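Before rerunning, a quick way to check whether the two files share any gene IDs at all could look like the sketch below. It makes no assumption about which column holds the ID and simply collects every token that looks like a human Ensembl gene ID (an assumed pattern) from each file's text; the function names are hypothetical.

```python
import re

# Assumption: human Ensembl gene IDs, with or without a ".<version>" suffix.
ENSEMBL_GENE = re.compile(r"\bENSG\d{11}(?:\.\d+)?\b")


def gene_ids(text: str) -> set:
    """Collect every Ensembl-gene-like token found anywhere in the text."""
    return set(ENSEMBL_GENE.findall(text))


def shared_gene_ids(bed_text: str, origin_text: str) -> set:
    """IDs that appear verbatim in both files; an empty set means nothing can be linked."""
    return gene_ids(bed_text) & gene_ids(origin_text)
```

If `shared_gene_ids` returns an empty set for the contents of the bed file and the --origin file, the IDs do not match as written (for example versioned versus unversioned, or IDs from different organisms).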

If this does not work, could you please run the tool using the argument --verbosity 4 to enable debug printouts? These would be helpful to me for identifying the problem.

I hope this solves your issue.

Best regards,
Moritz

@hyBio

hyBio commented Apr 19, 2024

Hi, I have the following three questions for CreateNetwork:

  1. Are the two columns in the motif2gene_mapping.txt file supposed to be motif name \t gene name, or gene name \t gene product name? The example shown below is very confusing to me.
    [image]
  2. Does the first column of motif2gene_mapping.txt need to match the fourth column of the TFBS file, and do I need to adjust it accordingly if I customize the motif names?
  3. For a non-model species, how should I get the motifs and their regulated gene sets? Can I just use the motif2gene_mapping.txt from the test data?

Looking forward to your reply, thanks a lot.

No activity for at least 30 days. Marking issue as stale. Stale issues are closed after one week.
