-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644
[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644
Conversation
for more information, see https://pre-commit.ci
…turn_empty_data_frame_from_tables_build
…turn_empty_data_frame_from_tables_build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 👏 👏 Really awesome work - great investigation and write-up. I didn't actually test it but I trust your manual tests + if @maxschulz-COL can give it a quick spin!
I would love to get to the bottom of the question I had about why this doesn't trigger the filter_interaction
since as you know it influences other things like the dynamic controls. But if it all works then happy to merge this without fully understanding it!
Just for other reviewers and to have it all in one place, there's a couple of things worth thinking about here.
Firstly, consider the case when you build a dashboard using dynamic data and it's running for 5 years. The data could have completely changed between the time you start the dashboard (run pre_build
and then first run build
) and the time you visit it (re-run build
). Hence any kind of pre-calculated AgGrid or Graph or whatever might be completely irrelevant and should not be cached between pre-build and build time.
Secondly, operations that run when you go to a page with an AgGrid are basically:
- data loading
- backend running of dag.AgGrid
- network time
- frontend rendering
(plus some overhead with server handling routing etc.)
Previously I had always assumed that (1) would be the bottleneck here. This was wrong since in @maxschulz-COL and @huong-li-nguyen examples the network time was actually very significant. So In actual fact:
- 1 is quick for static data, potentially slow for dynamic data
- 3 is quick for small data, potentially slow for large data
- 2 and 4 are quick
What this means is that in order for build
methods to me fast they need to avoid both 1 and 3. What that boils down to is a very important rule that build
methods must not load data.
I'm slightly sad that that the more we work on this, the less "useful content" is in AgGrid
and similar models and the more it just becomes the same as Figure
+ the "action info" type stuff. My original intention was that the build
method would be a bit more "real" in encoding the actual Dash component... but I am 100% in agreement now that this is the right way to do it. So it's great that we now have a good solution and a very clear direction and it's obvious what the right split between build
vs. __call__
is, which has been an open question for a loooong time. Definitely we're making good progress here! 💯
...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md
Outdated
Show resolved
Hide resolved
…turn_empty_data_frame_from_tables_build
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⭐ 🕵️ Fantastic investigation and great implementation!!
I reviewed the code in detail, and I like it, well done! I have not tested the entire table you showed, I take it you did that plenty, but trust that this is therefore the best solution. Giving it a short spin, this is quite nice!
I added another optiimzation that I could find, it is quite significant. If you agree, then let's add that one as well 💪
After that consideration, good to merge from my side
...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md
Outdated
Show resolved
Hide resolved
...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md
Outdated
Show resolved
Hide resolved
...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md
Outdated
Show resolved
Hide resolved
…turn_empty_data_frame_from_tables_build
Just to say that absence of custom code across different models makes me really happy. This enhances the genericity of the codebase, making it more maintainable, easier to understand and easier to customise from the users' side. The "actions_info" is an idea I got earlier this year while documenting ActionsV2 process and is placed on the Miro board as the first ticket on the AV2 Timeline. |
…turn_empty_data_frame_from_tables_build
…turn_empty_data_frame_from_tables_build
After the on_page_load action returns figures on the page, the trigger_to_global_store is triggered. This client-side callback is triggered as expected according to -> https://dash.plotly.com/advanced-callbacks#prevent-callback-execution-upon-initial-component-render, because the callback This should cause the gateway client-side callback to be triggered as well, since the |
Thanks for looking into it. I wonder if it's related to this issue? https://github.com/McK-Internal/vizro-internal/issues/1105#issue-2378867831 No need to solve it now, but let's make a note of it so we don't lose track of it. |
Great point! It seems like we're dealing with a similar scenario here. My main goal is to prevent any action from triggering another |
Description
References:
Why are we rethinking the AgGrid/Table build method?
Short reminder:
There are two phases that happen when the page is loaded:
build
phase -> renders all page components by returning the content made inside the components build method.on_page_load
phase -> triggers a process that calculates the final appearance of all targeted figures based on the selected filters/parameters/filter_interactions.Even when the big data is handled and showed with different figures, the
build
phase has to be completedquickly
, so Vizro users don't stuck with the "blank white page". Since the calculation for the final figures look could take a lot of time (e.g. if the data loading process is slow or if big data has to be transported through the network), we should return some light-weightplaceholder components
that will be replaced with the real components afterwards.After the build phase, the
on_page_load
mechanism is triggered which immediately puts all the targeted figures in the "loading state" and after some time it returns and renders filtered & parametrised figures to the UI.How it works now?
Currently the
vm.Graph.build
method works as expected. It returns aplaceholder graph
from thebuild
phase which is later replaced with the actual (filtered and parametrised) Graph object (result of the on_page_load mechanism) later.However, the
vm.AgGrid.build
and thevm.Table.build
methods returns objects made ofentire
(non-filtered) data_frame and originalparameters
from thebuild phase
which becomes replaced with the actual AgGrid/Table objects (result of the on_page_load mechanism).(We decided to return entire data_frames for ag_grid/tables just to quick fix the bugs mentioned at the beginning of this description.)
What do we want to achieve?
We want to optimise the AgGrid and Table build methods so they
do not
return an entire data_frame in build phase. Except the response time optimisation (which is the most important I guess), we also want to align them with the Graph build method that alsodoes not
return an entire data_frame from the build method. In this process we shouldn't recreate any of the persistence (or similar) bugs we solved with the #439.Vizro UX will hugely benefit from this "relieving the initial page loading" process, and this step is crucial for all the upcoming backend changes.
Task acceptance criteria:
page.build
phase:pagination
andselectedRows
to work smoothly (together with all other possible arguments).on_page_load
phase, with the filtered data_frame and parameterised function arguments.The table below shows how different objects returned from the vm.AgGrid build method (
columns
) match different acceptance criteria (rows
):Returned objects like
dag.AgGrid()
/html.Div()
are not considered at all as they raise the following error (even withsuppress_callback_exceptions=True
set) -> "A nonexistent object was used in anState
of a Dash callback...".Legend:
*❌ (we thought we fixed it)
- points out that the https://github.com/McK-Internal/vizro-internal/issues/670 is not actually fully solved. To prove it, run thescratch_dev
example, select filter [1, 3], and refresh the page.*❌ (probably internal dash-ag-grid bug)
- See the issue Column widths incorrectly calculated based on previous rowData when using autoSize in AgGrid plotly/dash-ag-grid#318Conclusions:
Based on all the knowledge gathered so far (which is mainly presented in the table above), we will go with the returned object
html.Div(id=self._input_component_id)
option from the vm.AgGrid/vm.Table build methods.Notice
I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":