Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOLD for payment 2024-10-30] [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats #50652

Open
adhorodyski opened this issue Oct 11, 2024 · 24 comments
Assignees
Labels
Reviewing Has a PR in review Weekly KSv2

Comments

@adhorodyski
Copy link
Contributor

adhorodyski commented Oct 11, 2024

Coming from the discussion thread under https://expensify.slack.com/archives/C05LX9D6E07/p1728314452317409

Problem

Lack of solid performance benchmarks across the core app workflows, baselines for proposing performance improvements.

Solution

  • Focus on 1 scenario of opening a chat
  • Benchmark it in detail to get the granular results (also for each phase) so we have solid numbers for whom specifically it's slow and why. There are also analytics we can use for this to have more datapoints.
  • Pinpoint the areas for improvement we see
  • Map how the proposal from initial thread/nitro modules/anything other can fit to improve it

cc @hannojg

@adhorodyski
Copy link
Contributor Author

@mountiny could you please assign this one to me & @hannojg? Thanks!

@hannojg
Copy link
Contributor

hannojg commented Oct 11, 2024

Thanks! 👋

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Oct 16, 2024

Update
Our bottom line is to get super solid on what precisely is slow (have granular measurements) and for whom specifically.

Who
To be able to say how many people something affects (in different percentiles), we need to have something to segment by. We'll extend our performance analytics' metadata (in all envs, including e2e tests) to include the below fields. These will help us group measurements by the types of accounts. (PR: #50881)

  • number of reports ✅
  • number of reportActions 🔄
  • number of personalDetails ✅
  • number of policies 🔄
  • number of transactions 🔄
  • number of transactionViolations 🔄

What
To be able to pinpoint what exactly is slow, we'll implement granular metrics in code, across both App & Onyx. Here's a breakdown of all the phases we decided to focus on.

  • APPLYING_UPDATES (coming over https, until we schedule notifying subscribers)
    • DATABASE_WRITE (actual save operations like multiset and multimerge)
  • NOTIFYING_SUBSCRIBERS (multiple key(s)Changed triggered by scheduleNotifyCollectionSubscribers)
    • DATABASE_READ
    • REACT_PROVIDER - (useOnyx/withOnyx)
  • ONYX_DATA_POSTPROCESSING (from receiving a new update (subscriber got called) until we have the final data)

How
TBC

@adhorodyski
Copy link
Contributor Author

cc @mountiny we'd like to ask you for any feedback about this 1st batch (please invite others!). Is there an area we're missing? I want to make sure we focus on the right parts when profiling this, the exact tools (like e2e suites) will come next once we fully agree on the who&what.

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Monthly KSv2 Weekly KSv2 labels Oct 21, 2024
@melvin-bot melvin-bot bot changed the title Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats Oct 22, 2024
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Oct 22, 2024
Copy link

melvin-bot bot commented Oct 22, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

Copy link

melvin-bot bot commented Oct 22, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 9.0.51-4 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-10-29. 🎊

For reference, here are some details about the assignees on this issue:

Copy link

melvin-bot bot commented Oct 22, 2024

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Oct 22, 2024

The extended analytics setup is merged now:

  • to be confirmed it's available in Firebase for newly reported events
  • to be merged into local-based profiling tools (on top of react-native-performance within the Performance module?) so we can use the same parameters in eg. e2e workflows cc @hannojg, I think this is the last thing we have to do when it comes to the 'segmentation' part

@adhorodyski
Copy link
Contributor Author

  • Next part is to implement the granular metrics as listed above

also cc @hannojg, do you think these should also be reported to Firebase OR we just focus on being able to profile this behind some flag?

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Weekly KSv2 labels Oct 23, 2024
@melvin-bot melvin-bot bot changed the title [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-10-30] [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats Oct 23, 2024

This comment was marked as off-topic.

@mountiny mountiny removed the Awaiting Payment Auto-added when associated PR is deployed to production label Oct 24, 2024
@adhorodyski
Copy link
Contributor Author

I'm back to this issue after some inactivity from my end, will post an update tomorrow :)

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Oct 30, 2024

For starters I'm investigating why updated metadata fields are not visible in Firebase. They're in code here for 2 weeks now.

cc @mountiny what is the last tag available in Hybrid, maybe it's a too short time window? #50881 (comment)

@adhorodyski
Copy link
Contributor Author

Besides above I'm working on putting code markers in the correct places now.

@mountiny
Copy link
Contributor

@adhorodyski are you getting staging data there too? latest staging is 9.0.55-9

Copy link

melvin-bot bot commented Oct 30, 2024

Skipping the payment summary for this issue since all the assignees are employees or vendors. If this is incorrect, please manually add the payment summary SO.

@hannojg
Copy link
Contributor

hannojg commented Nov 4, 2024

Sorry, this completely slipped through the cracks!

do you think these [the granular metrics as listed above] should also be reported to Firebase OR we just focus on being able to profile this behind some flag?

I think behind some flags so we can test those locally. I think it might be a bit too much to track all those events and send them to analytics. Lets get those implemented! How can i help on those?

@adhorodyski
Copy link
Contributor Author

I'm starting with the more high-level marks here, but for this very benchmark we need Performance.markStart/End() calls as described above so we can collect them to memory. Can you possibly pick even a few of them @hannojg?

@hannojg
Copy link
Contributor

hannojg commented Nov 6, 2024

Yes, will check! This will also involve making a few changes in onyx to enable these kind of measurements!

@hannojg
Copy link
Contributor

hannojg commented Nov 18, 2024

Working on a PR for onyx over here:

expecting to wrap it up by tomorrow!

With these changes we should be able to record all steps of what onyx does using the performance API / react-native-performance. We can then use the same API in the app to create reports for different actions performed.

@adhorodyski
Copy link
Contributor Author

Reviewing now!

@adhorodyski
Copy link
Contributor Author

Approved the PR to Onyx. I'll start adding the other metrics' calls (app-specific) into the app next week.

@adhorodyski
Copy link
Contributor Author

#53278 here's another PR for the applying_updates stage.

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Nov 29, 2024

Here's how we're currently looking with 2 items left. Not sure how we should wrap these though yet.

  • APPLYING_UPDATES 🛠️ PR by me
  • DATABASE_WRITE 🛠️ PR by Hanno
  • NOTIFYING_SUBSCRIBERS 🛠️ PR by Hanno
  • DATABASE_READ 🛠️ PR by Hanno
  • REACT_PROVIDER ⏭️ to be decorated in Onyx (?)
  • ONYX_DATA_POSTPROCESSING ⏭️

I'm also thinking on how we can visualize this data. I can see this being exportable as any other trace (it's an array of measurements) so we can maybe build a Gant chart out of this.

@mountiny mountiny self-assigned this Nov 30, 2024
@adhorodyski
Copy link
Contributor Author

After some considerations around ONYX_DATA_POSTPROCESSING I think it'd be the most precise (and easy to follow) to actually wrap all specific computations separately. I'll look for places we know take a bit, wrap them with markStart/markEnd so we are certain which specific parts are slow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Reviewing Has a PR in review Weekly KSv2
Projects
Development

No branches or pull requests

3 participants