Capture exact, related, narrow, broad synonyms in specific fields #175

Open
kevinschaper opened this issue Apr 24, 2025 · 5 comments
@kevinschaper
Collaborator

kevinschaper commented Apr 24, 2025

In the Monarch UI, we'd like to distinguish between synonym types (maybe show them on hover). I think my ideal would be to continue populating synonyms as-is (possibly with all synonyms?), while also capturing exact synonyms in exact_synonym, broad synonyms in broad_synonym, etc.
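
To make the proposal concrete, here is a minimal sketch of how a consumer such as the UI could recover a type label for each synonym if a node carried both the combined field and the typed fields. The field names follow the ones suggested above; narrow_synonym and related_synonym are assumed by analogy, and the node dictionary and the synonym_type_lookup helper are hypothetical, not part of any existing Monarch or KGX API.

```python
from typing import Any, Dict

# Typed fields proposed in this issue, mapped to a display label.
# These names are assumptions for illustration.
TYPED_FIELDS = {
    "exact_synonym": "exact",
    "narrow_synonym": "narrow",
    "broad_synonym": "broad",
    "related_synonym": "related",
}


def synonym_type_lookup(node: Dict[str, Any]) -> Dict[str, str]:
    """Map each synonym string to a type label for display (e.g. on hover).

    Synonyms that appear only in the combined field are labelled 'unspecified'.
    """
    lookup = {value: "unspecified" for value in node.get("synonym", [])}
    for field, label in TYPED_FIELDS.items():
        for value in node.get(field, []):
            lookup[value] = label
    return lookup


# Hypothetical node: the combined field holds everything, typed fields hold subsets.
example_node = {
    "id": "EX:1",
    "synonym": ["alpha", "beta", "gamma"],
    "exact_synonym": ["alpha"],
    "broad_synonym": ["beta"],
}
example_lookup = synonym_type_lookup(example_node)
```

Keeping the combined field populated means existing consumers keep working, while the typed fields only add information.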

@caufieldjh
Contributor

This is another area where the KGX format diverges from what Biolink specifies (and maybe another argument for a unified kgx-Biolink schema).

Not a technical limitation by any means, but a meaningful distinction to keep in mind for other ingests and graph transforms.

@matentzn

😱 How is this possible? I thought KGX was specifically made to be the TSV file format for Biolink. How exactly is it different here?

@caufieldjh
Contributor

KGX was made to be the format for Biolink and other non-Biolink KGs, so it's not a complete match in all ways.
In this case, the default KGX format has just one synonym field, synonym.
Nothing in it explicitly forbids splitting synonyms by type, and even in Biolink the four synonym subtypes are all is_a: synonym:
https://github.com/biolink/biolink-model/blob/c16ad903282be7733676788b75604580f03bf602/biolink-model.yaml#L594-L635
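
One way to read that is_a relationship: any value in a typed slot is also a plain synonym, so a combined synonym list can be derived from the typed ones. A minimal sketch under that reading follows. The snake_case field names are an assumed KGX-style rendering of the Biolink slots, and rollup_synonyms is a hypothetical helper, not a KGX function.

```python
from typing import Any, Dict, List

# Assumed snake_case renderings of the four Biolink synonym subtypes.
SYNONYM_SUBTYPES = (
    "exact_synonym",
    "narrow_synonym",
    "broad_synonym",
    "related_synonym",
)


def rollup_synonyms(node: Dict[str, Any]) -> List[str]:
    """Return untyped plus typed synonyms, de-duplicated, in first-seen order."""
    merged: List[str] = []
    for field in ("synonym",) + SYNONYM_SUBTYPES:
        for value in node.get(field, []):
            if value not in merged:
                merged.append(value)
    return merged


# Hypothetical node with overlapping values across fields.
rolled_up = rollup_synonyms(
    {"synonym": ["a"], "exact_synonym": ["a", "b"], "broad_synonym": ["c"]}
)
```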

@caufieldjh
Contributor

loosely related: biolink/kgx#218

@caufieldjh
Contributor

I think this may need to be addressed in kgx as there's data loss when phenio.json is transformed to KGX TSV; the former has synonym types:

      "id" : "http://purl.obolibrary.org/obo/CHEBI_17736",
      "lbl" : "S-(4-bromophenyl)-L-cysteine",
      "type" : "CLASS",
      "meta" : {
        "subsets" : [ "http://purl.obolibrary.org/obo/chebi#3_STAR" ],
        "synonyms" : [ {
          "synonymType" : "http://purl.obolibrary.org/obo/chebi#IUPAC_NAME",
          "pred" : "hasExactSynonym",
          "val" : "(2R)-2-amino-3-[(4-bromophenyl)sulfanyl]propanoic acid",
          "xrefs" : [ "IUPAC" ]
        }, {
          "pred" : "hasExactSynonym",
          "val" : "S-(4-Bromophenyl)-L-cysteine",
          "xrefs" : [ "KEGG_COMPOUND" ]
        }, {
          "synonymType" : "http://purl.obolibrary.org/obo/chebi#IUPAC_NAME",
          "pred" : "hasExactSynonym",
          "val" : "S-(4-bromophenyl)-L-cysteine",
          "xrefs" : [ "IUPAC" ]
        } ],
...

but the latter does not:

~/kg-phenio$ head -1 data/transformed/phenio/phenio_node_sources_nodes.tsv && grep CHEBI:17736 data/transformed/phenio/phenio_node_sources_nodes.tsv 
id      category        name    description     xref    provided_by     synonym deprecated      iri     same_as subsets
CHEBI:17736     biolink:ChemicalEntity  S-(4-bromophenyl)-L-cysteine            Beilstein:3204780|KEGG:C03900   infores:chebi   (2R)-2-amino-3-[(4-bromophenyl)sulfanyl]propanoic acid|S-(4-Bromophenyl)-L-cysteine|S-(4-bromophenyl)-L-cysteine                http://purl.obolibrary.org/obo/CHEBI_17736              3_STAR
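
For illustration, here is a rough sketch of what preserving the types could look like during the transform. It groups the synonyms of an OBO Graphs JSON node by their pred value into typed columns while still filling the combined synonym column. This is not how KGX currently behaves; the PRED_TO_FIELD mapping, the output column names, and the typed_synonym_columns helper are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Dict

# Assumed mapping from OBO Graphs synonym predicates to typed column names.
PRED_TO_FIELD = {
    "hasExactSynonym": "exact_synonym",
    "hasNarrowSynonym": "narrow_synonym",
    "hasBroadSynonym": "broad_synonym",
    "hasRelatedSynonym": "related_synonym",
}


def typed_synonym_columns(obograph_node: Dict[str, Any], sep: str = "|") -> Dict[str, str]:
    """Build delimiter-joined synonym columns from an OBO Graphs JSON node."""
    grouped = defaultdict(list)
    for syn in obograph_node.get("meta", {}).get("synonyms", []):
        value = syn["val"]
        # Always keep the combined column, as the current output does.
        if value not in grouped["synonym"]:
            grouped["synonym"].append(value)
        # Additionally record the value under its typed column, if the pred is known.
        field = PRED_TO_FIELD.get(syn.get("pred"))
        if field and value not in grouped[field]:
            grouped[field].append(value)
    return {field: sep.join(values) for field, values in grouped.items()}


# The CHEBI:17736 node from above, reduced to the keys this sketch reads.
chebi_node = {
    "id": "http://purl.obolibrary.org/obo/CHEBI_17736",
    "meta": {
        "synonyms": [
            {"pred": "hasExactSynonym",
             "val": "(2R)-2-amino-3-[(4-bromophenyl)sulfanyl]propanoic acid"},
            {"pred": "hasExactSynonym", "val": "S-(4-Bromophenyl)-L-cysteine"},
            {"pred": "hasExactSynonym", "val": "S-(4-bromophenyl)-L-cysteine"},
        ]
    },
}
columns = typed_synonym_columns(chebi_node)
```

For this node every synonym is exact, so the exact_synonym column would carry the same three values as synonym. The synonymType and xrefs values are still dropped in this sketch.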

3 participants