[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644

petar-qb · 2024-08-21T14:54:17Z

Description

References:

PR link that introduced returning the entire data_frame from the AgGrid/Table build method -> #439
Reasons behind the mentioned PR:

Why are we rethinking the AgGrid/Table build method?

Short reminder:

There are two phases that happen when the page is loaded:

build phase -> renders all page components by returning the content made inside the components build method.
on_page_load phase -> triggers a process that calculates the final appearance of all targeted figures based on the selected filters/parameters/filter_interactions.

Even when the big data is handled and showed with different figures, the build phase has to be completed quickly, so Vizro users don't stuck with the "blank white page". Since the calculation for the final figures look could take a lot of time (e.g. if the data loading process is slow or if big data has to be transported through the network), we should return some light-weight placeholder components that will be replaced with the real components afterwards.

After the build phase, the on_page_load mechanism is triggered which immediately puts all the targeted figures in the "loading state" and after some time it returns and renders filtered & parametrised figures to the UI.

How it works now?

Currently the vm.Graph.build method works as expected. It returns a placeholder graph from the build phase which is later replaced with the actual (filtered and parametrised) Graph object (result of the on_page_load mechanism) later.
However, the vm.AgGrid.build and the vm.Table.build methods returns objects made of entire (non-filtered) data_frame and original parameters from the build phase which becomes replaced with the actual AgGrid/Table objects (result of the on_page_load mechanism).

(We decided to return entire data_frames for ag_grid/tables just to quick fix the bugs mentioned at the beginning of this description.)

What do we want to achieve?

We want to optimise the AgGrid and Table build methods so they do not return an entire data_frame in build phase. Except the response time optimisation (which is the most important I guess), we also want to align them with the Graph build method that also does not return an entire data_frame from the build method. In this process we shouldn't recreate any of the persistence (or similar) bugs we solved with the #439.

Vizro UX will hugely benefit from this "relieving the initial page loading" process, and this step is crucial for all the upcoming backend changes.

Task acceptance criteria:

Relieve the page.build phase:
- Ideally by returning a hardcoded placeholder component from the AgGrid/Table build methods (similarly like it's done for the vm.Graph)
- OR by returning a real (fully configured) AgGrid/Table object but with an empty data_frame. (see in the table below why this is not optimal solution. This option is described within the "self.call(data_frame=pd.DataFrame())" column) .
Do not introduce any persistence (or similar) inconsistencies like:
- Reseting AG grid resets column widths when reloading the screen (if the columnSize="autoSize"),
- Column filter persistence,
- Not sending an empty data_frame into figure functions,
- Make the pagination and selectedRows to work smoothly (together with all other possible arguments).
Optimise the number of figure function calls. Ideally it should be called only once from the on_page_load phase, with the filtered data_frame and parameterised function arguments.

The table below shows how different objects returned from the vm.AgGrid build method (columns) match different acceptance criteria (rows):

Returned objects like dag.AgGrid() / html.Div() are not considered at all as they raise the following error (even with suppress_callback_exceptions=True set) -> "A nonexistent object was used in an State of a Dash callback...".

	dag.AgGrid(id=self._input_component_id)	self.call(data_frame=pd.DataFrame())	self.call()	html.Div(id=self._input_component_id)
Returns a light-weigh component from the build phase	✅	✅	❌	✅
Keeps column widths on page refresh	❌	❌	*❌ (we thought we fixed it)	✅
Keeps column widths on page nav	❌	❌	*❌ (we thought we fixed it)	✅
Keeps column widths on filter changes	❌	❌	❌	*❌ (probably internal dash-ag-grid bug)
"Column filter" persistence	✅	❌	✅	✅
"pagination" persistence	❌	✅	✅	✅
"selectedRows" persistence	✅	❌	❌	✅
Not sending an empty data_frame into figure functions	✅	❌	✅	✅
number of figure function calls - build phase	0 - ✅	2 - ❌	2 - ❌	0 - ✅
number of figure function calls - on_page_load phase	1 - ✅	2 - ❌	2 - ❌	1 - ✅

Legend:

*❌ (we thought we fixed it) - points out that the https://github.com/McK-Internal/vizro-internal/issues/670 is not actually fully solved. To prove it, run the scratch_dev example, select filter [1, 3], and refresh the page.
*❌ (probably internal dash-ag-grid bug) - See the issue Column widths incorrectly calculated based on previous rowData when using autoSize in AgGrid plotly/dash-ag-grid#318

Conclusions:

Based on all the knowledge gathered so far (which is mainly presented in the table above), we will go with the returned object html.Div(id=self._input_component_id) option from the vm.AgGrid/vm.Table build methods.

Notice

I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":
- I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
- I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
- I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
- I have not referenced individuals, products or companies in any commits, directly or indirectly.
- I have not added data or restricted code in any commits, directly or indirectly.

for more information, see https://pre-commit.ci

vizro-core/src/vizro/models/_components/figure.py

vizro-core/src/vizro/_vizro.py

vizro-core/src/vizro/actions/_actions_utils.py

vizro-core/tests/unit/vizro/tables/test_dash_ag_grid.py

…turn_empty_data_frame_from_tables_build

antonymilne

👏 👏 👏 Really awesome work - great investigation and write-up. I didn't actually test it but I trust your manual tests + if @maxschulz-COL can give it a quick spin!

I would love to get to the bottom of the question I had about why this doesn't trigger the filter_interaction since as you know it influences other things like the dynamic controls. But if it all works then happy to merge this without fully understanding it!

Just for other reviewers and to have it all in one place, there's a couple of things worth thinking about here.

Firstly, consider the case when you build a dashboard using dynamic data and it's running for 5 years. The data could have completely changed between the time you start the dashboard (run pre_build and then first run build) and the time you visit it (re-run build). Hence any kind of pre-calculated AgGrid or Graph or whatever might be completely irrelevant and should not be cached between pre-build and build time.

Secondly, operations that run when you go to a page with an AgGrid are basically:

data loading
backend running of dag.AgGrid
network time
frontend rendering

(plus some overhead with server handling routing etc.)

Previously I had always assumed that (1) would be the bottleneck here. This was wrong since in @maxschulz-COL and @huong-li-nguyen examples the network time was actually very significant. So In actual fact:

1 is quick for static data, potentially slow for dynamic data
3 is quick for small data, potentially slow for large data
2 and 4 are quick

What this means is that in order for build methods to me fast they need to avoid both 1 and 3. What that boils down to is a very important rule that build methods must not load data.

I'm slightly sad that that the more we work on this, the less "useful content" is in AgGrid and similar models and the more it just becomes the same as Figure + the "action info" type stuff. My original intention was that the build method would be a bit more "real" in encoding the actual Dash component... but I am 100% in agreement now that this is the right way to do it. So it's great that we now have a good solution and a very clear direction and it's obvious what the right split between build vs. __call__ is, which has been an open question for a loooong time. Definitely we're making good progress here! 💯

vizro-core/src/vizro/_vizro.py

...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md

…turn_empty_data_frame_from_tables_build

for more information, see https://pre-commit.ci

maxschulz-COL

⭐ 🕵️ Fantastic investigation and great implementation!!

I reviewed the code in detail, and I like it, well done! I have not tested the entire table you showed, I take it you did that plenty, but trust that this is therefore the best solution. Giving it a short spin, this is quite nice!

I added another optiimzation that I could find, it is quite significant. If you agree, then let's add that one as well 💪

After that consideration, good to merge from my side

...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md

vizro-core/tests/unit/vizro/tables/test_dash_ag_grid.py

vizro-core/src/vizro/tables/_dash_ag_grid.py

…turn_empty_data_frame_from_tables_build

petar-qb · 2024-08-26T09:55:43Z

@antonymilne

I'm slightly sad that that the more we work on this, the less "useful content" is in AgGrid and similar models and the more it just becomes the same as Figure + the "action info" type stuff

Just to say that absence of custom code across different models makes me really happy. This enhances the genericity of the codebase, making it more maintainable, easier to understand and easier to customise from the users' side.

The "actions_info" is an idea I got earlier this year while documenting ActionsV2 process and is placed on the Miro board as the first ticket on the AV2 Timeline.

…turn_empty_data_frame_from_tables_build

petar-qb · 2024-08-27T05:56:01Z

@antonymilne

I would love to get to the bottom of the question I had about why this doesn't trigger the filter_interaction since as you know it influences other things like the dynamic controls. But if it all works then happy to merge this without fully understanding it!

After the on_page_load action returns figures on the page, the trigger_to_global_store is triggered. This client-side callback is triggered as expected according to -> https://dash.plotly.com/advanced-callbacks#prevent-callback-execution-upon-initial-component-render, because the callback Input is generated by the on_page_load and its Output is a global local store (that already exists in the client side of the app).

This should cause the gateway client-side callback to be triggered as well, since the trigger_to_global_store Output is the Input of the gateway callback (which would cause filter_interaction to be triggered). 🤔 However, I didn't succeed to figure out why the gateway callback is not triggered (as it should be), and wasn't able to trigger the gateway callback even when I removed all the prevent_initial_call=True parameters.

antonymilne · 2024-08-27T08:58:23Z

After the on_page_load action returns figures on the page, the trigger_to_global_store is triggered. This client-side callback is triggered as expected according to -> https://dash.plotly.com/advanced-callbacks#prevent-callback-execution-upon-initial-component-render, because the callback Input is generated by the on_page_load and its Output is a global local store (that already exists in the client side of the app).

This should cause the gateway client-side callback to be triggered as well, since the trigger_to_global_store Output is the Input of the gateway callback (which would cause filter_interaction to be triggered). 🤔 However, I didn't succeed to figure out why the gateway callback is not triggered (as it should be), and wasn't able to trigger the gateway callback even when I removed all the prevent_initial_call=True parameters.

Thanks for looking into it. I wonder if it's related to this issue? https://github.com/McK-Internal/vizro-internal/issues/1105#issue-2378867831

No need to solve it now, but let's make a note of it so we don't lose track of it.

petar-qb · 2024-08-27T10:06:35Z

I wonder if it's related to this issue?

Great point! It seems like we're dealing with a similar scenario here.

My main goal is to prevent any action from triggering another implicitly (which seems like it's the case right now). Instead, all actions should be explicitly defined using action chains. Given that, I wouldn't suggest altering the current behaviour. However, I'd like to dedicate more time to investigating this further and making the action loop more robust. We can address this in a separate PR when we find more time.

petar-qb added 6 commits August 8, 2024 11:59

How to reproduce a ag_grid bug

9df7be6

AgGrid tested

b9a0289

First version

9406791

Deleting unnecessary my_dev example

a711948

Returned empty data_frame from figure component build method

83000ea

Conflicts solved

77871f8

petar-qb requested review from Joseph-Perkins, antonymilne, huong-li-nguyen and maxschulz-COL as code owners August 21, 2024 14:54

petar-qb and others added 2 commits August 21, 2024 17:03

changelog file added

f2c21f9

[pre-commit.ci] auto fixes from pre-commit.com hooks

cb8e8e5

for more information, see https://pre-commit.ci

petar-qb commented Aug 21, 2024

View reviewed changes

Revert a minor change

68ae21c

petar-qb changed the title ~~Feat/return empty data frame from tables build~~ [Tidy] Return empty data frame from tables build Aug 21, 2024

petar-qb changed the title ~~[Tidy] Return empty data frame from tables build~~ [Tidy] Return empty data frame from AgGrid, Table and Figure build methods Aug 21, 2024

petar-qb added 2 commits August 22, 2024 09:55

Reverting Figure build comment about self.id location

d21e46b

changelog typo

90c54da

petar-qb commented Aug 22, 2024

View reviewed changes

vizro-core/tests/unit/vizro/tables/test_dash_ag_grid.py Show resolved Hide resolved

petar-qb added 3 commits August 22, 2024 10:34

Minor changes

7ef8b6d

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

8368e6d

…turn_empty_data_frame_from_tables_build

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

5d650ae

…turn_empty_data_frame_from_tables_build

petar-qb self-assigned this Aug 22, 2024

petar-qb mentioned this pull request Aug 22, 2024

Custom action on Page loading #643

Closed

1 task

antonymilne approved these changes Aug 22, 2024

View reviewed changes

vizro-core/src/vizro/_vizro.py Show resolved Hide resolved

vizro-core/src/vizro/_vizro.py Show resolved Hide resolved

...-core/changelog.d/20240821_165515_petar_pejovic_return_empty_data_frame_from_tables_build.md Outdated Show resolved Hide resolved

petar-qb and others added 3 commits August 23, 2024 08:56

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

a42ec2d

…turn_empty_data_frame_from_tables_build

Additional optimization for AgGrid

006ba35

[pre-commit.ci] auto fixes from pre-commit.com hooks

44fd247

for more information, see https://pre-commit.ci

maxschulz-COL approved these changes Aug 23, 2024

View reviewed changes

Addressing PR comments

2596a87

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

8877bb9

…turn_empty_data_frame_from_tables_build

petar-qb requested a review from maxschulz-COL August 26, 2024 09:57

petar-qb mentioned this pull request Aug 26, 2024

[Demo] Replace KPI cards with dynamic figure functions #648

Merged

2 tasks

petar-qb added 2 commits August 26, 2024 13:49

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

639e175

…turn_empty_data_frame_from_tables_build

Reverting dash_ag_grid columnDefs optimisation

ce00213

maxschulz-COL approved these changes Aug 26, 2024

View reviewed changes

Merge branch 'main' of https://github.com/mckinsey/vizro into feat/re…

30f5917

…turn_empty_data_frame_from_tables_build

petar-qb merged commit 84c0ffc into main Aug 27, 2024
32 checks passed

petar-qb deleted the feat/return_empty_data_frame_from_tables_build branch August 27, 2024 10:10

This was referenced Sep 3, 2024

[POC] KPI Infinite scroll AgGrid #519

Draft

[Feat] Enable header and footer inside Graph, Table and AgGrid #669

Merged

petar-qb mentioned this pull request Sep 19, 2024

AgGrid doesn't work with persistence props #389

Closed

1 task

petar-qb mentioned this pull request Nov 12, 2024

[Bug] Fix theme flickering and 404 page #865

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644

[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644

petar-qb commented Aug 21, 2024 •

edited

Loading

antonymilne left a comment •

edited

Loading

maxschulz-COL left a comment

petar-qb commented Aug 26, 2024

petar-qb commented Aug 27, 2024

antonymilne commented Aug 27, 2024

petar-qb commented Aug 27, 2024

[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644

[Tidy] Return empty data frame from AgGrid, Table and Figure build methods #644

Conversation

petar-qb commented Aug 21, 2024 • edited Loading

Description

Why are we rethinking the AgGrid/Table build method?

Short reminder:

How it works now?

What do we want to achieve?

Conclusions:

Notice

antonymilne left a comment • edited Loading

Choose a reason for hiding this comment

maxschulz-COL left a comment

Choose a reason for hiding this comment

petar-qb commented Aug 26, 2024

petar-qb commented Aug 27, 2024

antonymilne commented Aug 27, 2024

petar-qb commented Aug 27, 2024

petar-qb commented Aug 21, 2024 •

edited

Loading

antonymilne left a comment •

edited

Loading