Missing module IDs #7

Open
AlfonsEdbom opened this issue Mar 6, 2023 · 3 comments

Comments

@AlfonsEdbom

AlfonsEdbom commented Mar 6, 2023

Hi!

I have mapped my sequencing reads against the IGC database (nucleotide sequences) and wanted to run Omixer-RPM using the GBM database. However, I noticed that some "genes" present in the GBM database are not found in the IGC, including "genes" that are a required "step" in at least one GBM module.

How should I go about finding or creating a database to map my reads against that contains all relevant information for all modules present in the GBM database?

Below is a list of "genes" that are required "steps" in a pathway but are not present in the IGC 9.9M "IGC annotation and occurrence frequency summary table":

  • "K03416"
  • "K11782"
  • "K11785"
  • "K17489"
  • "K17490"
  • "K18118"
  • "K19268"
  • "NOG132553"
  • "NOG133663"
  • "bactNOG01844"
  • "bactNOG07881"
  • "bactNOG15341"
  • "bactNOG18519"
  • "firmNOG00626"
  • "firmNOG04290"
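
A minimal sketch of how such a list could be derived, assuming the required IDs from the GBM modules and the annotation fields from the IGC summary table have already been read into Python. The function names, the `unknown` placeholder, and the `;`/`,` separators are assumptions for illustration, not the documented layout of either file.

```python
import re


def annotated_ids(annotation_fields):
    """Collect every ID found in the given annotation fields.

    Assumption: one field may hold several IDs separated by ';', ','
    or whitespace, and un-annotated genes carry the word 'unknown'.
    """
    found = set()
    for field in annotation_fields:
        for token in re.split(r"[;,\s]+", field.strip()):
            if token and token != "unknown":
                found.add(token)
    return found


def missing_ids(required, annotation_fields):
    """Return the required IDs that never occur in the annotation fields."""
    return sorted(set(required) - annotated_ids(annotation_fields))


# Toy data only: three required IDs and three annotation fields.
required = ["K00001", "K03416", "NOG132553"]
fields = ["K00001;K00002", "unknown", "COG0001"]
result = missing_ids(required, fields)
print(result)
```

With the real files, `required` would hold every ID used as a module step and `fields` would hold the KEGG and eggNOG columns of the annotation table.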

Kind regards,
Alfons

@MireiaVallescolomer

Hi Alfons,

This is indeed because there are no genes with these annotations in the 10M IGC. A possible alternative with a more up-to-date database is to use the KO annotations you can get from HUMAnN3 (https://github.com/biobakery/biobakery/wiki/humann3) and then run omixer-rpm on those to compute GBM coverage and abundance.
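
As a rough sketch of the step between the two tools, the snippet below reduces a HUMAnN-style KO table to its community-level rows. It assumes the usual HUMAnN layout: a header line starting with `#`, tab-separated columns, per-species rows marked with `|`, and the special `UNMAPPED` and `UNGROUPED` rows. The function name and the toy table are hypothetical, and whether omixer-rpm needs further reformatting of the result should be checked against its own documentation.

```python
import csv
import io


def unstratified_ko_rows(lines):
    """Yield (feature, abundances) for community-level rows only."""
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header
        feature = row[0]
        if "|" in feature:
            continue  # skip per-species (stratified) rows
        if feature in ("UNMAPPED", "UNGROUPED"):
            continue  # skip reads without a KO assignment
        yield feature, [float(value) for value in row[1:]]


# Toy table only; the species label is made up.
example = io.StringIO(
    "# Gene Family\tsample_A\n"
    "UNMAPPED\t10.0\n"
    "UNGROUPED\t5.0\n"
    "K03416\t2.5\n"
    "K03416|g__Example.s__Example_species\t2.5\n"
)
ko_table = dict(unstratified_ko_rows(example))
print(ko_table)
```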

Cheers,

Mireia

@AlfonsEdbom

Hi!

Thank you for your response!
I am now able to get the missing KO annotations. However, I still cannot find the missing NOG annotations. I tried translating the HUMAnN3 output into eggNOG annotations with humann_regroup_table, using the latest version of their database for mapping UniRef90 to eggNOGs (https://github.com/biobakery/humann/blob/master/humann/data/misc/map_eggnog_uniref90.txt.gz), but none of these annotations are found in that database. Do you know of a different version of the database that contains these eggNOG annotations, or of a way to translate the missing eggNOGs into another annotation (such as KEGG KOs or UniRef90)?
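
The check described above can be sketched as follows, assuming the mapping file is gzip-compressed, tab-separated text with the group ID in the first column and its UniRef90 members in the remaining columns. The helper names are hypothetical, and the demo writes a tiny stand-in file with made-up member IDs rather than reading the real one.

```python
import gzip
import os
import tempfile


def groups_in_mapping(path):
    """Collect the first-column group IDs of a gzipped mapping file."""
    groups = set()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            group = line.split("\t", 1)[0].strip()
            if group:
                groups.add(group)
    return groups


def split_by_presence(wanted, groups):
    """Split the wanted IDs into those present in and absent from the mapping."""
    present = sorted(i for i in wanted if i in groups)
    absent = sorted(i for i in wanted if i not in groups)
    return present, absent


# Toy mapping file only; UniRef90_X and UniRef90_Y are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_map.txt.gz")
    with gzip.open(toy_path, "wt") as handle:
        handle.write("COG0001\tUniRef90_X\tUniRef90_Y\n")
    groups = groups_in_mapping(toy_path)

present, absent = split_by_presence(["COG0001", "NOG132553"], groups)
print(present, absent)
```

Pointing `groups_in_mapping` at the downloaded mapping file and passing the missing IDs from the list in the first post would show which of them, if any, that release covers.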

Kind regards,
Alfons Edbom

@MireiaVallescolomer

Hi Alfons,

This is probably because of different eggNOG versions: we used version 3.0. Translation to UniRef90 IDs is definitely possible; eggNOG annotations were used when no suitable KO was found. We'll update you when an updated release of the GBM database is available.

Best,

Mireia
