-
Notifications
You must be signed in to change notification settings - Fork 212
Use built_value
for AssetNode
and related types.
#3914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR HealthChangelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. |
8c0dc7d
to
129e99b
Compare
I would advise against using any codegen for these classes, it is a bit annoying sure to hand write the serialization, but it is much more of a pain in general to have code generator deps in build_runner itself (or actually anything that build_runner depends on). Basically, you consistently end up in situations where you cannot run build_runner at all because these files are either out of date or have been deleted, so you can't compile, but you need to compile to run the build, etc. We made this mistake in build_modules and build_config and it is really quite annoying sometimes. On another note is our serialization is already quite slow, so please benchmark this with large asset graphs, in previous testing I saw built value being about 50% slower than normal JSON deserialization. We also probably want to leave open the option for using a different format than JSON for the asset graph. |
That's definitely a concern. I wrote a script that runs build_runner using deps from pub instead of the local versions; that seems to do the job.
I am expecting that this will end up much, much smaller+faster.
Thanks. |
Do compare against json_serializable or a similar package. I definitely have a bias against anything builder pattern and I know you own built_value so it's a default for you, but we can easily resolve that by just measuring it. If it's not as much slower as I think then it's fine. |
I think this is the right way to go independent of serialization: immutable data is ideal for adding optimizations that reuse subgraphs. That should shrink the 20Mb for the 1000 node benchmark to more like 20k. Real cases with less perfect shared subgraphs will gain less, of course, but they should still gain a lot. Actual serialization performance should end up being much less important, I can look into it if it still shows up on the benchmarks. |
I don't have anything against using immutable data here, it's the doubling of allocations needed to create/copy/update any object when using the builder pattern that I am concerned about. This applies to not just serialization but all object creation. |
Reducing copying is one of the big features of |
4bada91
to
a2e7940
Compare
built_value
for AssetNode
built_value
for AssetNode
and related types.
e17b0ec
to
e436172
Compare
e436172
to
d2f36bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I certainly don't like the runtime increasing by 160%, and I'm not 100% sold on generated code like this.
I'm willing to let this run though, but even after a future, hopefully big, decrease in runtime, we should probably be prepared to walk this back if it shows up in profiles (while I get that the overhead will likely be much less in the future scenario, performance is also avoiding doing stuff that takes longer than the alternative).
.toList(); | ||
|
||
for (final node in potentialNodes) { | ||
for (final potentialId in potentialIds) { | ||
final node = _assetGraph.get(potentialId)!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this node be different from the node we extracted the id from above? (aka I don't understand why we map it to the id, then get the node based on id?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to avoid having locals around that would become stale after the _buildAsset
in the next loop.
This whole method can be improved--added comments and some for loops instead of where / where / map, hopefully this looks better :) thanks.
@@ -991,68 +1021,81 @@ class _SingleBuild { | |||
for (var output in outputs) { | |||
var wasOutput = readerWriter.assetsWritten.contains(output); | |||
var digest = wasOutput ? await _readerWriter.digest(output) : null; | |||
var node = _assetGraph.get(output)!; | |||
|
|||
// **IMPORTANT**: All updates to `node` must be synchronous. With lazy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this no longer "IMPORTANT"? Or should the comment be kept?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the nodes are now immutable, new nodes arrive in the graph atomically, and old references to nodes don't see any update until they fetch again.
Added to that, we only run one generator at a time now.
So I think it would be very hard to hit any issue :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I certainly don't like the runtime increasing by 160%, and I'm not 100% sold on generated code like this.
I'm willing to let this run though, but even after a future, hopefully big, decrease in runtime, we should probably be prepared to walk this back if it shows up in profiles (while I get that the overhead will likely be much less in the future scenario, performance is also avoiding doing stuff that takes longer than the alternative).
Entirely reasonable--thanks.
@@ -991,68 +1021,81 @@ class _SingleBuild { | |||
for (var output in outputs) { | |||
var wasOutput = readerWriter.assetsWritten.contains(output); | |||
var digest = wasOutput ? await _readerWriter.digest(output) : null; | |||
var node = _assetGraph.get(output)!; | |||
|
|||
// **IMPORTANT**: All updates to `node` must be synchronous. With lazy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the nodes are now immutable, new nodes arrive in the graph atomically, and old references to nodes don't see any update until they fetch again.
Added to that, we only run one generator at a time now.
So I think it would be very hard to hit any issue :)
.toList(); | ||
|
||
for (final node in potentialNodes) { | ||
for (final potentialId in potentialIds) { | ||
final node = _assetGraph.get(potentialId)!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to avoid having locals around that would become stale after the _buildAsset
in the next loop.
This whole method can be improved--added comments and some for loops instead of where / where / map, hopefully this looks better :) thanks.
For #3811, review and merge #3908 first.
built_value
gives a way to represent immutable data and serialize it, which looks like a very good fit here.The code is closely equivalent to what was there before; the biggest change is that because
AssetNode
is now immutable it's necessary to fetch updated nodes rather than getting mutations "for free".This is still an intermediate state: the next PR can switch to using
built_value
for serialization and test (value comparison) code, then it becomes much easier to make changes to the data model.As expected this is worse for performance
which is fine: the
AssetGraph
updates that are expensive were already too expensive before this PR, it's exactly duplicate work in these updates that needs addressing.