Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(zql): callable relationships #3629

Merged
merged 8 commits into from
Jan 30, 2025
Merged

Conversation

grgbkr
Copy link
Contributor

@grgbkr grgbkr commented Jan 28, 2025

The Exist operators #fetchNodeForRow is quite expensive and is a work around for the fact that Node's relationships can only be consumed a single time. In order to read the relevant relationship, Exist has to fetch from source the parent Node. This fetch is pure overhead, we are fetching a row we already have.

This change updates Node's relationships to be callable, so that they can be consumed multiple times.

Node's type changes from

export type Node = {
  row: Row;
  relationships: Record<string, Stream<Node>>;
};

to

export type Node = {
  row: Row;
  relationships: Record<string, () => Stream<Node>>;
};

This change also updates ChildChange to have a node: Node rather than a row: Row so that the relationships needed by Exist are also available on ChildChange. Now all Change types use Node rather than Row.

Finally this addresses an issue with cleanup where Node's filtered by Exists or Filter were not being drained (as they never reach view-apply-change which does the draining). Added draining to cleanup in Exists and Filter for Nodes not yielded to downstream.

Copy link

vercel bot commented Jan 28, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
replicache-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 30, 2025 2:19am
zbugs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jan 30, 2025 2:19am

Copy link

github-actions bot commented Jan 28, 2025

🐰 Bencher Report

Branchgrgbkr/callable-relationships-2
Testbedlocalhost
Click to view all benchmark results
BenchmarkFile SizeBenchmark Result
kilobytes (KB)
(Result Δ%)
Upper Boundary
kilobytes (KB)
(Limit %)
zero-package.tgz📈 view plot
🚷 view threshold
919.01
(-0.04%)
937.79
(98.00%)
zero.js📈 view plot
🚷 view threshold
171.32
(-0.07%)
174.87
(97.97%)
zero.js.br📈 view plot
🚷 view threshold
47.71
(-0.09%)
48.71
(97.95%)
🐰 View full continuous benchmarking report in Bencher

Copy link

github-actions bot commented Jan 28, 2025

🐰 Bencher Report

Branchgrgbkr/callable-relationships-2
Testbedlocalhost
Click to view all benchmark results
BenchmarkThroughputBenchmark Result
operations / second (ops/s)
(Result Δ%)
Lower Boundary
operations / second (ops/s)
(Limit %)
src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers)📈 view plot
🚷 view threshold
73.93
(+2.16%)
69.71
(94.29%)
src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers)📈 view plot
🚷 view threshold
93.41
(-1.12%)
91.05
(97.47%)
🐰 View full continuous benchmarking report in Bencher

@tantaman
Copy link
Contributor

tantaman commented Jan 29, 2025

I realize this isn't done yet but it currently looks amazing in terms of perf gains.

before:
image

after:
image

@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from 79f311f to e5f86e6 Compare January 29, 2025 19:05
@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from e5f86e6 to 2bcec65 Compare January 29, 2025 19:55
@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from 2bcec65 to da36f04 Compare January 29, 2025 19:59
@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from 9df484d to 19894b1 Compare January 29, 2025 20:29
@grgbkr grgbkr marked this pull request as ready for review January 29, 2025 21:35
@grgbkr grgbkr requested a review from tantaman January 29, 2025 21:53
@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from 19894b1 to 9cc0cd8 Compare January 29, 2025 22:28
@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from e5d4576 to cc73c6b Compare January 29, 2025 23:39
@aboodman
Copy link
Contributor

What's one more layer of indirection between friends 🙃

@grgbkr grgbkr force-pushed the grgbkr/callable-relationships-2 branch from 07dcb86 to d74e321 Compare January 30, 2025 02:18
@grgbkr grgbkr merged commit c952b11 into main Jan 30, 2025
11 checks passed
@grgbkr grgbkr deleted the grgbkr/callable-relationships-2 branch January 30, 2025 02:23
Copy link
Contributor

@tantaman tantaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Doesn't look like much logic changed (except for exists and some wrapping of the relationship in join with a lambda), mostly just changing how invocations of relationships work.

lmk if I missed a significant change.

*/
export type ChildChange = {
type: 'child';
row: Row;
node: Node;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@arv
Copy link
Contributor

arv commented Jan 30, 2025

Added draining to cleanup in Exists and Filter for Nodes not yielded to downstream.

Do we need to do something with FanIn/FanOut when it has no output?

@grgbkr
Copy link
Contributor Author

grgbkr commented Jan 31, 2025

Added draining to cleanup in Exists and Filter for Nodes not yielded to downstream.

Do we need to do something with FanIn/FanOut when it has no output?

Good point... yeah I think my change is not quite right because I wasn't thinking about FanIn/FanOut... a single Filter shouldn't decide to drain the node because other Filter's may let it through... hmmm

Copy link
Contributor

@tantaman tantaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had a stale pending comment about drain

CleanShot 2025-01-31 at 13 47 58

yield node;
} else {
drainStreams(node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah, I don't understand why we have to drainStreams here and in filter.

Copy link
Contributor Author

@grgbkr grgbkr Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an existing problem with cleanup, that I realized while updating tests for this change.

Cleanup is lazy/stream based so that it can be limited by take. Cleanup relies on something downstream draining the streams for the cleanup to actually happen.

If an operator filters out a node (as Exist or Filter does) that node is not going to reach the thing downstream that normally would drain the streams, and the cleanup for that part of the tree won't happen. The idea behind this change was to ensure the cleanup by draining the node's stream at the point it is filtered out.

My change is not quite right though, as now the node may get drained multiple times when we have a FanIn/FanOut as multiple filter/exists operators may filter it (each of them draining it), but then it may still get passed downstream if another branch does not filter it (for it to be drained another time).

Still thinking about how to address it.

cc @arv

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants