Add gap-fill option to subset method in extract #1128

Open
cmungall opened this issue Jul 7, 2023 · 3 comments


@cmungall
Contributor

cmungall commented Jul 7, 2023

I am creating this issue as a continuation of @gouttegd's comment here:

robot extract has a subset method described here, which is most useful. In the terminology of owltools, this method implements "gap spanning". owltools had an additional option when making subsets, "gap filling". This would add all intermediate nodes and edges between terms in the subset.

this can be illustrated using the test subset ontology

let's make a subset:

cat > SUBSET <<EOF
ONT:1
ONT:4
EOF

we can then look at the difference between the two commands:

owltools  ./docs/examples/subset.obo --extract-ontology-subset -i SUBSET --fill-gaps -o -f obo /tmp/gap-filled.obo
owltools  ./docs/examples/subset.obo --extract-ontology-subset -i SUBSET  -o -f obo /tmp/gap-spanned.obo

gap-spanned.obo:

format-version: 1.2
subsetdef: foo "foo"
ontology: test-subset

[Term]
id: ONT:1
subset: foo

[Term]
id: ONT:4
relationship: part_of ONT:1

[Typedef]
id: overlaps
xref: RO:0002131

[Typedef]
id: part_of
xref: BFO:0000050
is_transitive: true

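note that the part_of edge from ONT:4 to ONT:1 is not asserted in the source ontology; it is entailed, because part_of is transitive and the asserted chain runs ONT:4 part_of ONT:3 part_of ONT:2 part_of ONT:1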
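To make the "gap spanning" behaviour concrete, here is a minimal Python sketch. It is not the owltools or robot implementation. It assumes the ontology can be reduced to a set of direct (subject, object) part_of pairs, uses the chain that appears in the gap-filled output, and the helper names `reachable` and `span_gaps` are made up for illustration.

```python
def reachable(direct_edges, start):
    """Return every term reachable from `start` by following one or more direct edges."""
    successors = {}
    for s, o in direct_edges:
        successors.setdefault(s, set()).add(o)
    seen = set()
    stack = list(successors.get(start, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(successors.get(node, ()))
    return seen


def span_gaps(direct_edges, subset):
    """Connect subset terms with entailed edges, skipping everything in between."""
    subset = set(subset)
    # Entailed edges whose endpoints are both in the subset.
    entailed = {
        (s, o)
        for s in subset
        for o in reachable(direct_edges, s)
        if o in subset and o != s
    }
    # Drop an edge when another subset term already sits between its endpoints.
    return {
        (s, o)
        for (s, o) in entailed
        if not any((s, m) in entailed and (m, o) in entailed for m in subset)
    }


# Direct part_of edges taken from the gap-filled output (assumed to be all that matter here).
DIRECT = {("ONT:4", "ONT:3"), ("ONT:3", "ONT:2"), ("ONT:2", "ONT:1")}

spanned = span_gaps(DIRECT, {"ONT:1", "ONT:4"})
print(sorted(spanned))
```

With the subset ONT:1 and ONT:4, the only edge left is the entailed one from ONT:4 to ONT:1, which matches the single relationship line in gap-spanned.obo.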

gap-filled.obo:

format-version: 1.2
subsetdef: foo "foo"
ontology: test-subset

[Term]
id: ONT:1
subset: foo

[Term]
id: ONT:2
relationship: part_of ONT:1

[Term]
id: ONT:3
relationship: part_of ONT:2

[Term]
id: ONT:4
relationship: part_of ONT:3

[Typedef]
id: overlaps
xref: RO:0002131

[Typedef]
id: part_of
xref: BFO:0000050
is_transitive: true

Proposal:

either:

  • a new sibling --method option called something like intermediate-filled-subset, or
  • a new option on extract, applicable only to the subset method, called something like --include-intermediate-nodes

(we should name this carefully; the owltools terminology of gap filling/spanning is not great)

unlike the default subset approach which connects subset terms via entailed edges, this would traverse all intermediate nodes via direct edges.

Algorithm:

SubsetWithIntermediates(O,S,P):
  S' = S
  for each t in O-S:
    if there exist t1, t2 in S such that <t1 P t> and <t P t2>:
       add t to S'
  O' = {}
  for each e in RGdirect(O):
    if e.s in S' and e.o in S':
       add e to O'
  return O'

Here <s P o> means that there exists a relation graph direct or indirect edge <s p o> such that p is in P or p is rdfs:subClassOf
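The algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions, not a proposed implementation: the ontology is reduced to a set of direct (subject, object) pairs, every pair is assumed to use a predicate in P or rdfs:subClassOf (so P is not passed explicitly), RGdirect(O) is taken to be that set of direct pairs, and ONT:5 is a hypothetical off-path term added only to show that it gets excluded.

```python
def entailed_edges(direct_edges):
    """Return all (s, o) pairs connected by one or more direct edges."""
    successors = {}
    for s, o in direct_edges:
        successors.setdefault(s, set()).add(o)
    closure = set()
    for start in successors:
        seen = set()
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(successors.get(node, ()))
        closure |= {(start, node) for node in seen}
    return closure


def subset_with_intermediates(direct_edges, subset):
    """Return (S', O'): the subset plus intermediate terms, and the direct edges among them."""
    subset = set(subset)
    entailed = entailed_edges(direct_edges)
    all_terms = {term for edge in direct_edges for term in edge}
    kept_terms = set(subset)
    for t in all_terms - subset:
        # t qualifies if some subset term reaches it and it reaches some subset term.
        has_subset_descendant = any((t1, t) in entailed for t1 in subset)
        has_subset_ancestor = any((t, t2) in entailed for t2 in subset)
        if has_subset_descendant and has_subset_ancestor:
            kept_terms.add(t)
    kept_edges = {(s, o) for s, o in direct_edges if s in kept_terms and o in kept_terms}
    return kept_terms, kept_edges


# Direct part_of edges from the gap-filled output, plus a hypothetical ONT:5 hanging off ONT:2.
DIRECT = {
    ("ONT:4", "ONT:3"),
    ("ONT:3", "ONT:2"),
    ("ONT:2", "ONT:1"),
    ("ONT:5", "ONT:2"),  # hypothetical: not between any two subset terms
}

terms, edges = subset_with_intermediates(DIRECT, {"ONT:1", "ONT:4"})
print(sorted(terms))
print(sorted(edges))
```

For the subset ONT:1 and ONT:4 this keeps ONT:2 and ONT:3 together with the three direct part_of edges, which matches the gap-filled.obo output; ONT:5 is dropped because no subset term lies below it.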

@dosumis

dosumis commented Jul 7, 2023

This would be useful (as would a general specification of an algo for graph traversal using the UberGraph redundant graph).

@dosumis

dosumis commented Jul 7, 2023

Of your two options, this is clearer to me:

  • new option on extract that is only applicable for subset called something like --include-intermediate-nodes

@matentzn
Contributor

matentzn commented Aug 5, 2023

I feel very uncomfortable committing to implementing this. @dosumis could we put this on @hkir-dev's plate perhaps? I have just assigned 12 robot issues to myself, wanting to do a push.
