Update to DisGeNet 5? #22

tonigi · 2017-12-11T12:59:08Z

I am under the impression that the DGN database included with the package corresponds to DisGeNet release 4. Would it be possible to upgrade it to version 5? There is now a distribution in SQLite format, which may make the transition easier.

A more complex task would be to keep the "Evidence index" or "Source" annotations, in order to filter out weak/negative associations.

Thanks!
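The filtering idea in this comment could look roughly like the sketch below. It assumes the association table exposes `score` and `source` columns (an assumption about the release 5 layout, not something confirmed in this thread). The cutoff, the source whitelist, and the toy rows are purely illustrative.

```r
# Toy stand-in for the association table. Column names (geneId, diseaseId,
# score, source) are assumed, and the rows are made up for illustration.
x <- read.delim(text = paste(
  "geneId\tdiseaseId\tscore\tsource",
  "1\tC0000001\t0.60\tCTD_human",
  "2\tC0000001\t0.01\tBEFREE",
  "3\tC0000002\t0.25\tUNIPROT",
  sep = "\n"
), stringsAsFactors = FALSE)

min_score    <- 0.1                         # hypothetical cutoff
keep_sources <- c("CTD_human", "UNIPROT")   # hypothetical whitelist

# Keep only associations that pass both the score and the source filter,
# then build the disease-to-gene table from the surviving rows.
strong <- x[x$score >= min_score & x$source %in% keep_sources, ]
d2g <- unique(strong[, c("diseaseId", "geneId")])
```

Filtering before calling `unique()` matters here: once the table is reduced to disease and gene columns, the evidence information is no longer available to filter on.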

tonigi · 2017-12-11T14:27:24Z

Actually, I was able to regenerate them with minor changes to your script. In particular, I downloaded the "ALL gene-disease associations" file and changed inst/extdata/build_DGN_Anno.R as follows:

x <- read.delim("all_gene_disease_associations.tsv.gz", comment.char = "#",
                stringsAsFactors = FALSE, fileEncoding = "ISO-8859-1")
d2n <- unique(x[, c(3, 4)])  # New column positions; using names would be better
d2g <- unique(x[, c(3, 1)])
# No need to special-case non-ASCII chars any more
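As the comment in the snippet notes, selecting columns by name would be more robust than by position. A minimal sketch of that variant follows, run on inline toy data so it is self-contained. The header names (`geneId`, `geneSymbol`, `diseaseId`, `diseaseName`) are assumptions and should be checked against the actual file. The disease rows are made up.

```r
# Toy TSV with a leading comment line, mimicking the layout assumed above.
tsv <- paste(
  "# toy header comment, skipped via comment.char",
  "geneId\tgeneSymbol\tdiseaseId\tdiseaseName",
  "1\tA1BG\tC0000001\tDisease one",
  "1\tA1BG\tC0000001\tDisease one",
  "2\tA2M\tC0000001\tDisease one",
  sep = "\n"
)
x <- read.delim(text = tsv, comment.char = "#", stringsAsFactors = FALSE)

# Fail early if a future release renames or drops a column.
needed <- c("geneId", "diseaseId", "diseaseName")
stopifnot(all(needed %in% names(x)))

d2n <- unique(x[, c("diseaseId", "diseaseName")])  # disease ID -> name
d2g <- unique(x[, c("diseaseId", "geneId")])       # disease ID -> gene ID
```

With names, a reordering of columns between releases produces either the same result or an immediate error, rather than silently building the tables from the wrong columns.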

Still needs checking; e.g., somewhat oddly, the total number of annotations is now 17074 (previously 17381).
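One way to investigate a change in counts like this is to diff the old and new tables directly. The sketch below uses toy data. `old_d2g` and `new_d2g` are hypothetical names for the disease-to-gene tables from the two builds, and how the package actually stores them is not covered in this thread.

```r
# Toy disease-to-gene tables standing in for the old and new builds.
old_d2g <- data.frame(diseaseId = c("C1", "C1", "C2", "C3"),
                      geneId = c(1, 2, 2, 3), stringsAsFactors = FALSE)
new_d2g <- data.frame(diseaseId = c("C1", "C2", "C4"),
                      geneId = c(1, 2, 5), stringsAsFactors = FALSE)

# Encode each pair as a single key so set operations apply to whole rows.
key <- function(d) paste(d$diseaseId, d$geneId, sep = "|")

dropped <- setdiff(key(old_d2g), key(new_d2g))  # present before, gone now
added   <- setdiff(key(new_d2g), key(old_d2g))  # new in this build

# The same comparison at the disease level.
lost_diseases <- setdiff(old_d2g$diseaseId, new_d2g$diseaseId)
```

Inspecting `dropped` and `lost_diseases` on the real tables would show whether the lower total comes from removed diseases, removed pairs, or identifier changes between releases.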

GuangchuangYu · 2017-12-12T08:52:49Z

If you figure it out, a PR is welcome.

merge

GuangchuangYu pushed a commit that referenced this issue Dec 10, 2023

Merge pull request #22 from YuLab-SMU/devel

5bc5127
