TypeDB 3.0: all query clauses are chainable streaming operations #7037

Open
flyingsilverfin opened this issue Apr 11, 2024 · 0 comments

Problem to Solve

We aim to make our TypeQL queries more consistent and expressive, reformulating them in terms of streams/iterators commonly known from programming languages, also allowing them to be chained into longer sequences of operations.

Proposed Solution

We aim to have the following type system for different clauses and operations on streams of concept maps (Stream<ConceptMap>). Operations can be freely composed based on this type system.

Production operations

These operations are used to produce streams.

  • match clause: Stream<ConceptMap> -> Stream<ConceptMap>
    • Acts like a 'flatMap' in general (for each input concept map, continue the output stream with 0 or more concept maps).
    • The first match clause can be consistently thought of as mapping from an empty stream.
    • Always returns unique stream elements, i.e., streams are de-duplicated.
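
The flatMap reading of match can be sketched in Python. This is only an illustrative model, not TypeDB's implementation: concept maps are modelled as plain dicts, and the names `match_clause`, `friendship_pattern`, and the toy `friends` data are hypothetical. Seeding the first match with a single empty concept map is also an assumption, made here so that the flatMap has something to map over.

```python
from typing import Callable, Dict, Iterable, Iterator

ConceptMap = Dict[str, object]  # hypothetical stand-in for a TypeDB concept map


def match_clause(
    inputs: Iterable[ConceptMap],
    pattern: Callable[[ConceptMap], Iterable[ConceptMap]],
) -> Iterator[ConceptMap]:
    """flatMap each input map through `pattern`, emitting each distinct answer once."""
    seen = set()
    for concept_map in inputs:
        for answer in pattern(concept_map):
            merged = {**concept_map, **answer}
            key = frozenset(merged.items())  # assumes hashable values
            if key not in seen:
                seen.add(key)
                yield merged


# Toy data (hypothetical): friendship pairs, with one duplicate.
friends = [("alice", "bob"), ("alice", "carol"), ("alice", "bob")]


def friendship_pattern(_: ConceptMap) -> Iterable[ConceptMap]:
    # Yields zero or more answers per input map; here the input is ignored.
    return ({"x": x, "y": y} for x, y in friends)


# Assumption: the first match is seeded with one empty concept map.
answers = list(match_clause([{}], friendship_pattern))
```

In this model the duplicate pair is emitted only once, and a match over a stream with no elements at all yields nothing.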

While match can operate on an empty input stream, the following operations cannot. We also refer to them as modifying operations. They are all of type Stream<ConceptMap> -> Stream<ConceptMap>.

  • filter $x, $y: projects the input concept map stream onto the given variables and de-duplicates the result, yielding a stream of smaller concept maps. Note that the variables we filter on (e.g. $x, $y) must exist in the concept maps of the input stream.
  • skip N: replaces offset, using a term that better conveys the cost of the operation.
  • take N: replaces limit with a term that is consistent with skip.
  • sort $x, $y: sorts the stream on the given variables (in order). Sometimes this may be optimized to avoid collecting all answers and re-emitting them as a stream.
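
As a rough sketch of how these modifiers compose, each one can be modelled as a function from an iterable of concept maps to an iterator of concept maps. The function names (`filter_vars`, `skip`, `take`, `sort_by`) and the dict representation are assumptions made for illustration; in particular, this `sort_by` always collects the whole stream, which a real implementation may be able to avoid.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, Sequence

ConceptMap = Dict[str, object]  # hypothetical stand-in for a TypeDB concept map


def filter_vars(stream: Iterable[ConceptMap], variables: Sequence[str]) -> Iterator[ConceptMap]:
    """Project each map onto `variables` and drop duplicate projections."""
    seen = set()
    for concept_map in stream:
        # Raises KeyError if a filtered variable is missing from the input map.
        key = tuple(concept_map[v] for v in variables)
        if key not in seen:
            seen.add(key)
            yield dict(zip(variables, key))


def skip(stream: Iterable[ConceptMap], n: int) -> Iterator[ConceptMap]:
    """Discard the first n maps; they are still consumed from the input."""
    return islice(stream, n, None)


def take(stream: Iterable[ConceptMap], n: int) -> Iterator[ConceptMap]:
    """Emit at most the first n maps."""
    return islice(stream, n)


def sort_by(stream: Iterable[ConceptMap], variables: Sequence[str]) -> Iterator[ConceptMap]:
    """Sort on the given variables, in order (collects the whole stream)."""
    return iter(sorted(stream, key=lambda cm: tuple(cm[v] for v in variables)))


stream = [{"x": 3, "y": "a"}, {"x": 1, "y": "b"}, {"x": 3, "y": "c"}, {"x": 2, "y": "d"}]
# Roughly: filter $x; sort $x; skip 1; take 2;
result = list(take(skip(sort_by(filter_vars(stream, ["x"]), ["x"]), 1), 2))
```

Because every function has the same input and output shape, they can be chained in any order, which is the property the type system above is meant to guarantee.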

Effect operations

These operations are called primarily for their effect on the DB (rather than on the stream). This effect may also affect the stream.

  • insert clause: Stream<ConceptMap> -> Stream<ConceptMap>
    • Each input concept map is possibly augmented with new concepts, which were inserted during the insert clause.
    • Always returns unique stream elements.
  • delete clause: Stream<ConceptMap> -> Stream<ConceptMap>
    • Each input concept map is possibly shrunk as concepts are deleted from the map, in the exact inverse of the insert clause.
    • Outputs are not necessarily unique: if all concepts are deleted, a stream of empty concept maps is produced. These are still useful, since they can be counted!
    • If a delete statement contains an @cascade annotation then deletes will cascade based on the original input stream, i.e., we will cascade through potentially already deleted objects.
  • put clause: Stream<ConceptMap> -> Stream<ConceptMap>
    • The clause put P, for a pattern P, does the following: if match P yields an empty stream, it executes insert P on the original input stream; otherwise it does nothing further, keeping the output of the executed match P.
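
The delete and put semantics described above can be modelled with a small Python sketch. Everything here is an assumption made for illustration: concept maps are plain dicts, the database is a toy set of names, and `delete_clause`, `put_clause`, `match_alice`, and `insert_alice` are hypothetical helpers. The sketch also assumes that put tests the emptiness of the whole match stream rather than deciding per input map.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set

ConceptMap = Dict[str, object]  # hypothetical stand-in for a TypeDB concept map
Clause = Callable[[List[ConceptMap]], Iterable[ConceptMap]]


def delete_clause(stream: Iterable[ConceptMap], deleted_vars: Set[str]) -> Iterator[ConceptMap]:
    """Drop deleted variables from each map; duplicates (even empty maps) are kept."""
    for concept_map in stream:
        yield {k: v for k, v in concept_map.items() if k not in deleted_vars}


def put_clause(inputs: Iterable[ConceptMap], match_p: Clause, insert_p: Clause) -> List[ConceptMap]:
    """Run match P; only if it yields nothing, run insert P on the original inputs."""
    original = list(inputs)  # kept so the insert branch sees the original stream
    matched = list(match_p(original))
    if matched:
        return matched
    return list(insert_p(original))


db: Set[str] = set()  # toy database of names


def match_alice(stream: List[ConceptMap]) -> List[ConceptMap]:
    return [{**cm, "p": "Alice"} for cm in stream if "Alice" in db]


def insert_alice(stream: List[ConceptMap]) -> List[ConceptMap]:
    out = []
    for cm in stream:
        db.add("Alice")
        out.append({**cm, "p": "Alice"})
    return out


first = put_clause([{}], match_alice, insert_alice)   # nothing matches, so it inserts
second = put_clause([{}], match_alice, insert_alice)  # now matches, so no insert

# Deleting every variable leaves one empty map per input, which can still be counted.
after_delete = list(delete_clause([{"x": "p1"}, {"x": "p2"}], {"x"}))
deleted_count = len(after_delete)
```

Running put twice leaves the toy database unchanged after the first call, which is the idempotent behaviour the clause is meant to provide.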

Aggregation operations

These operations aggregate a stream into a single object (e.g. a value, a list of values, or a list of JSONs). These operations end a query clause chain.

  • check clause: Stream<ConceptMap> -> bool
    • A new clause that just checks for the existence of any answers, and returns this as a boolean.
  • reduce clause: Stream<ConceptMap> -> (... values ...), replacing the get; aggregate fn; clauses.
    • The reduce clause applies one or more reducing operators to the input stream:
      • count(var) -> long: returns 0 for 0 element stream
      • min/mean/max/sum/stddev: return an optional value that must be handled with a new optional-empty handler
    • Example: match $x... $a... $b...; reduce count(), min($a), mean($b) ?: 0.0 (syntax not finalised)
  • fetch clause: Stream<ConceptMap> -> Stream<JSON>
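
The difference between count, which always has a value, and the other reducers, which may have none, can be sketched in Python using `Optional`. The helper names below (`check_clause`, `count`, `reduce_min`, `reduce_mean`, `or_default`) are hypothetical, and `or_default` is only one possible reading of the proposed `?:` handler, whose syntax is not finalised.

```python
from statistics import mean
from typing import Dict, Iterable, Optional

ConceptMap = Dict[str, object]  # hypothetical stand-in for a TypeDB concept map


def check_clause(stream: Iterable[ConceptMap]) -> bool:
    """True if the stream has at least one answer."""
    return any(True for _ in stream)


def count(stream: Iterable[ConceptMap]) -> int:
    """Always defined: an empty stream counts as 0."""
    return sum(1 for _ in stream)


def reduce_min(stream: Iterable[ConceptMap], var: str) -> Optional[float]:
    values = [cm[var] for cm in stream]
    return min(values) if values else None  # no minimum of an empty stream


def reduce_mean(stream: Iterable[ConceptMap], var: str) -> Optional[float]:
    values = [cm[var] for cm in stream]
    return mean(values) if values else None  # no mean of an empty stream


def or_default(value: Optional[float], default: float) -> float:
    """Assumed reading of `?:`: substitute a default when the reducer is empty."""
    return default if value is None else value


answers = [{"a": 4.0, "b": 1.0}, {"a": 2.0, "b": 3.0}]
summary = (count(answers), reduce_min(answers, "a"), or_default(reduce_mean(answers, "b"), 0.0))
empty_summary = (count([]), reduce_min([], "a"), or_default(reduce_mean([], "b"), 0.0))
```

Returning an optional forces the caller to decide what an empty stream means for min or mean, rather than silently receiving a sentinel such as 0.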

Control operations

Control operations give the user the ability to control the execution of a chained query.

  • assert: Stream<ConceptMap> -> Result<Stream<ConceptMap>>:
    • When running assert COND @or_print("test failed");, the output is the input stream if COND is true; otherwise an error is returned, with a customizable error message given by the @or_print annotation.
    • the precise syntax for assert conditions is TBD.
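
One way to picture the Result-typed behaviour of assert is the Python sketch below, where the error case is modelled as an exception. The names `assert_clause` and `AssertionFailed` are hypothetical, and evaluating the condition against the whole materialised stream is an assumption, since the condition syntax is still to be decided.

```python
from typing import Callable, Dict, Iterable, Iterator, List

ConceptMap = Dict[str, object]  # hypothetical stand-in for a TypeDB concept map


class AssertionFailed(Exception):
    """Models the error branch of Result<Stream<ConceptMap>>."""


def assert_clause(
    stream: Iterable[ConceptMap],
    cond: Callable[[List[ConceptMap]], bool],
    or_print: str = "assertion failed",
) -> Iterator[ConceptMap]:
    """Pass the input stream through unchanged if `cond` holds, else raise."""
    materialised = list(stream)  # assumption: COND may inspect the whole stream
    if not cond(materialised):
        raise AssertionFailed(or_print)
    return iter(materialised)


# Condition holds: the output is the input stream.
ok = list(assert_clause([{"x": 1}], lambda s: len(s) > 0, or_print="test failed"))

# Condition fails: the custom message is surfaced as the error.
try:
    assert_clause([], lambda s: len(s) > 0, or_print="test failed")
    error_message = None
except AssertionFailed as err:
    error_message = str(err)
```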

Summary

As a result of these changes, the user will always be able to truncate their query stream at any clause and execute it, without needing an explicit get clause. This makes get redundant for this purpose, and we absorb get's filtering operation into the new filter modifier.

Here's an example of building a longer query stream:

match
  ($x, $y) isa friendship;
// Unique stream<$x, $y>
filter $x; 
// Unique stream<$x>
insert
  $x has name "Alice";
// Unique stream<$x>
match
  $x has email $e; $e == "[email protected]";
// Unique stream<$x, $e>
delete
  $x isa person;
// Non-unique stream<$e>
reduce count;
// returns number of elements in the stream, which is the number of deleted people