[Bug] Fix bug that prevented custom tables/grids with column referral #439

Merged
maxschulz-COL merged 22 commits into main from bug/missing_DF_435 on May 12, 2024

Conversation

maxschulz-COL
Contributor

@maxschulz-COL maxschulz-COL commented Apr 26, 2024

Description

Closes #435 and https://github.com/McK-Internal/vizro-internal/issues/747

We now send the full DF during the build phase (instead of an empty pd.DataFrame). This could have performance consequences for very large DFs, but in the grand scheme of things I think it is probably the better solution.

An alternative would be to send an empty DF with the columns present, but that may lead to other bugs down the road.
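
To illustrate the failure mode, here is a minimal sketch using only pandas. `custom_table` is a hypothetical stand-in for a user-defined table/grid function, not the Vizro API, and the sample data is made up. It refers to a column by name, which is what breaks when the build phase passes a bare `pd.DataFrame()`.

```python
import pandas as pd


def custom_table(data_frame: pd.DataFrame) -> list:
    """Hypothetical user function that refers to a specific column."""
    # Column referral: raises KeyError if "country" is missing.
    return sorted(data_frame["country"].unique())


# Made-up sample data standing in for the DF the user provides.
full_df = pd.DataFrame({"country": ["DE", "FR", "DE"], "pop": [83, 68, 83]})

# Previous build-phase behaviour: an empty frame without any columns.
try:
    custom_table(pd.DataFrame())
    build_failed = False
except KeyError:
    build_failed = True

# Behaviour in this PR: the full frame is passed during build.
full_result = custom_table(full_df)

# The alternative: an empty frame that keeps the columns.
columns_only_result = custom_table(full_df.head(0))
```

In this sketch both the full DF and the columns-only DF get past the column lookup; only the column-less empty DF fails.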

Screenshot

Notice

  • I acknowledge and agree that, by checking this box and clicking "Submit Pull Request":

    • I submit this contribution under the Apache 2.0 license and represent that I am entitled to do so on behalf of myself, my employer, or relevant third parties, as applicable.
    • I certify that (a) this contribution is my original creation and / or (b) to the extent it is not my original creation, I am authorized to submit this contribution on behalf of the original creator(s) or their licensees.
    • I certify that the use of this contribution as authorized by the Apache 2.0 license does not violate the intellectual property rights of anyone else.
    • I have not referenced individuals, products or companies in any commits, directly or indirectly.
    • I have not added data or restricted code in any commits, directly or indirectly.

@maxschulz-COL maxschulz-COL self-assigned this Apr 26, 2024
@maxschulz-COL maxschulz-COL added Status: Ready for Review ☑️ Issue/PR is ready for review - all tests have passed Issue: Bug Report 🐛 Issue/PR that report/fix a bug labels Apr 26, 2024
@huong-li-nguyen
Contributor

huong-li-nguyen commented Apr 26, 2024

Thanks for fixing so quickly! 🚀 What "potential bugs down the road" do you imagine? I actually like the alternative solution approach - it's also the one we've implemented on VizX actually. I think it did not lead to any major bugs (as far as I remember), and the performance issue was more severe.

Do we know how other tools deal with this? e.g. do they load in the data in batches? or do they always load in the entire dataset?

@petar-qb petar-qb self-requested a review April 26, 2024 08:31
@maxschulz-COL
Contributor Author

maxschulz-COL commented Apr 26, 2024

Thanks for fixing so quickly! 🚀 What "potential bugs down the road" do you imagine? I actually like the alternative solution approach - it's also the one we've implemented on VizX actually. I think it did not lead to any major bugs (as far as I remember), and the performance issue was more severe.

So off the top of my head I could imagine that someone refers not only to a column, but also to a specific row. This would then fail, as there are no rows present.

This seems to be one case in a wider range of things where people refer to or rely on the size of the data, or generally anything regarding the original DF they provide.

One could argue that it is bad design to write a custom grid/table like that, but the confusion of the user showed that people simply do not expect the data to be different at any point from what they originally provide.
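
A minimal sketch of that concern, assuming a hypothetical custom function `first_country` that reads the first row (the name and data are made up for illustration): a columns-only DF fixes column lookups but still has no rows to index.

```python
import pandas as pd


def first_country(data_frame: pd.DataFrame) -> str:
    """Hypothetical user function that relies on a specific row."""
    # Positional row access: raises IndexError on a frame with no rows.
    return data_frame["country"].iloc[0]


# Made-up sample data standing in for the DF the user provides.
full_df = pd.DataFrame({"country": ["DE", "FR"], "pop": [83, 68]})
columns_only_df = full_df.head(0)  # columns kept, zero rows

try:
    first_country(columns_only_df)
    row_lookup_failed = False
except IndexError:
    row_lookup_failed = True
```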

Do we know how other tools deal with this? e.g. do they load in the data in batches? or do they always load in the entire dataset?

So AgGrid, for example, offers the option to enable batch loading with infinite scroll, but I think that somewhat defeats the point. In principle we are not concerned with loading the entire data; that is what we do anyway once the page loads with all filters/parameters etc. It is simply this initial build, which immediately gets overwritten, that we want to avoid because it is "redundant".

Ideally we would want to just create an entirely different loading component (like for graph), but it has been shown (see comment above the build lines in both table and grid) that having different settings for the objects causes problems (the case we observed was the pagination setting). Remember the long discussion we had where you said, just write "Do not change this line" :). So this is related to that.
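
The "build, then immediately overwrite" pattern described above can be sketched roughly as follows. This is not how Vizro is implemented: `build`, `on_page_load`, `custom_grid` and the sample data are all hypothetical names chosen for illustration, and the only point is that the user's function runs twice.

```python
import pandas as pd

# Records how many rows the user function saw on each call.
rows_seen = []


def custom_grid(data_frame: pd.DataFrame) -> dict:
    """Hypothetical user-defined grid function."""
    rows_seen.append(len(data_frame))
    return {
        "columns": list(data_frame.columns),
        "rows": data_frame.to_dict("records"),
    }


def build(figure_fn, data: pd.DataFrame) -> dict:
    """Hypothetical build step: with this PR it receives the full DF."""
    return figure_fn(data)


def on_page_load(figure_fn, data: pd.DataFrame, country: str) -> dict:
    """Hypothetical page-load step: re-runs the function on filtered data."""
    return figure_fn(data[data["country"] == country])


full_df = pd.DataFrame({"country": ["DE", "FR", "IT"], "pop": [83, 68, 59]})

initial = build(custom_grid, full_df)              # first call, result is discarded
loaded = on_page_load(custom_grid, full_df, "FR")  # second call, result is shown
```

The first result is thrown away as soon as the page loads, which is why passing the full DF at build time costs extra work without changing what the user finally sees.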

@huong-li-nguyen
Contributor

Thanks for clarifying!

@antonymilne
Contributor

I would like to review this so please don't merge yet 🙂

Contributor

@antonymilne antonymilne left a comment


tl;dr: thanks @petar-qb and @maxschulz-COL for your great work on this. Hopefully longer term we will somehow have an entirely better system here but for now this looks good.


I prefer the solution @petar-qb suggested in https://github.com/mckinsey/vizro/pull/439/files#r1580918187 so would be great if you could try it out. But the current solution is fine too.

I am not hugely concerned right now about the performance hit this incurs, but I am not keen on it either. In general I don't much like the "double loading" we currently have where we do everything once with an empty dataframe (or the real thing, like now) and then immediately override it with the real data. This PR unfortunately, but necessarily, adds another layer of unsatisfactoriness to that scheme 😬

That isn't meant as a criticism of this PR or the existing system, since I know the current on-page-load system has its merits and lots of good reasoning behind it. I tried to come up with some improvements before and couldn't. I just hope that as part of the actions v2 work we can somehow improve this scheme though 🤞

@maxschulz-COL maxschulz-COL enabled auto-merge (squash) May 7, 2024 06:52
@maxschulz-COL maxschulz-COL merged commit 296a99e into main May 12, 2024
34 checks passed
@maxschulz-COL maxschulz-COL deleted the bug/missing_DF_435 branch May 12, 2024 07:48
Development

Successfully merging this pull request may close these issues.

Custom AG Grid function overwrites pandas.DataFrame provided as input with an empty pandas.DataFrame
4 participants