Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats #50652

Open
adhorodyski opened this issue Oct 11, 2024 · 49 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production External Added to denote the issue can be worked on by a contributor NewFeature Something to build that is a new item. Reviewing Has a PR in review Weekly KSv2

Comments

@adhorodyski
Copy link
Contributor

adhorodyski commented Oct 11, 2024

Coming from the discussion thread under https://expensify.slack.com/archives/C05LX9D6E07/p1728314452317409

Problem

Lack of solid performance benchmarks across the core app workflows, baselines for proposing performance improvements.

Solution

  • Focus on 1 scenario of opening a chat
  • Benchmark it in detail to get the granular results (also for each phase) so we have solid numbers for whom specifically it's slow and why. There are also analytics we can use for this to have more datapoints.
  • Pinpoint the areas for improvement we see
  • Map how the proposal from initial thread/nitro modules/anything other can fit to improve it

cc @hannojg

Issue OwnerCurrent Issue Owner: @rayane-djouah
Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~021873115691475801087
  • Upwork Job ID: 1873115691475801087
  • Last Price Increase: 2024-12-28
@adhorodyski
Copy link
Contributor Author

@mountiny could you please assign this one to me & @hannojg? Thanks!

@hannojg
Copy link
Contributor

hannojg commented Oct 11, 2024

Thanks! 👋

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Oct 16, 2024

Update
Our bottom line is to get super solid on what precisely is slow (have granular measurements) and for whom specifically.

Who
To be able to say how many people something affects (in different percentiles), we need to have something to segment by. We'll extend our performance analytics' metadata (in all envs, including e2e tests) to include the below fields. These will help us group measurements by the types of accounts. (PR: #50881)

  • number of reports ✅
  • number of reportActions 🔄
  • number of personalDetails ✅
  • number of policies 🔄
  • number of transactions 🔄
  • number of transactionViolations 🔄

What
To be able to pinpoint what exactly is slow, we'll implement granular metrics in code, across both App & Onyx. Here's a breakdown of all the phases we decided to focus on.

  • APPLYING_UPDATES (coming over https, until we schedule notifying subscribers)
    • DATABASE_WRITE (actual save operations like multiset and multimerge)
  • NOTIFYING_SUBSCRIBERS (multiple key(s)Changed triggered by scheduleNotifyCollectionSubscribers)
    • DATABASE_READ
    • REACT_PROVIDER - (useOnyx/withOnyx)
  • ONYX_DATA_POSTPROCESSING (from receiving a new update (subscriber got called) until we have the final data)

How
TBC

@adhorodyski
Copy link
Contributor Author

cc @mountiny we'd like to ask you for any feedback about this 1st batch (please invite others!). Is there an area we're missing? I want to make sure we focus on the right parts when profiling this, the exact tools (like e2e suites) will come next once we fully agree on the who&what.

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Monthly KSv2 Weekly KSv2 labels Oct 21, 2024
@melvin-bot melvin-bot bot changed the title Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats Oct 22, 2024
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Oct 22, 2024
Copy link

melvin-bot bot commented Oct 22, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".

Copy link

melvin-bot bot commented Oct 22, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 9.0.51-4 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-10-29. 🎊

For reference, here are some details about the assignees on this issue:

Copy link

melvin-bot bot commented Oct 22, 2024

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.

@adhorodyski
Copy link
Contributor Author

adhorodyski commented Oct 22, 2024

The extended analytics setup is merged now:

  • to be confirmed it's available in Firebase for newly reported events
  • to be merged into local-based profiling tools (on top of react-native-performance within the Performance module?) so we can use the same parameters in eg. e2e workflows cc @hannojg, I think this is the last thing we have to do when it comes to the 'segmentation' part

@adhorodyski
Copy link
Contributor Author

  • Next part is to implement the granular metrics as listed above

also cc @hannojg, do you think these should also be reported to Firebase OR we just focus on being able to profile this behind some flag?

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Weekly KSv2 labels Oct 23, 2024
@melvin-bot melvin-bot bot changed the title [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-10-30] [HOLD for payment 2024-10-29] Benchmark performance bottlenecks when opening chats Oct 23, 2024

This comment was marked as off-topic.

@mountiny mountiny removed the Awaiting Payment Auto-added when associated PR is deployed to production label Oct 24, 2024
@melvin-bot melvin-bot bot added Weekly KSv2 and removed Daily KSv2 labels Dec 18, 2024
@melvin-bot melvin-bot bot changed the title Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats Dec 18, 2024
Copy link

melvin-bot bot commented Dec 18, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 9.0.76-12 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-12-25. 🎊

For reference, here are some details about the assignees on this issue:

@adhorodyski
Copy link
Contributor Author

Opened a draft PR and I will proceed with more testing so this can be reviewed from Monday.
I just have to make sure it works as expected (if 'measures' need to be filtered in some cases), but it looks stable. I'll try analyzing this data at the same time.

@adhorodyski
Copy link
Contributor Author

Tested this today, here's an update:

[{"detail": undefined, "duration": 15.607542037963867, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968390627.430583}, {"detail": undefined, "duration": 15.607542037963867, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968390627.430583}, {"detail": undefined, "duration": 15.607542037963867, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968390627.430583}, {"detail": undefined, "duration": 259.7711670398712, "entryType": "measure", "name": "sidebar_loaded", "startTime": 968390619.279541}, {"detail": undefined, "duration": 14.05049991607666, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968390969.205625}, {"detail": undefined, "duration": 0.24849998950958252, "entryType": "measure", "name": "apply_https_updates", "startTime": 968391216.804041}, {"detail": undefined, "duration": 0.3620830774307251, "entryType": "measure", "name": "apply_https_updates", "startTime": 968391539.429333}, {"detail": undefined, "duration": 0.11170804500579834, "entryType": "measure", "name": "apply_https_updates", "startTime": [968392154.4035](tel:9683921544035)}, {"detail": undefined, "duration": 18.054041028022766, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968393363.46525}, {"detail": undefined, "duration": 0.1734999418258667, "entryType": "measure", "name": "apply_https_updates", "startTime": 968393604.243}, {"detail": [Object], "duration": 69.57566821575165, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968393895.232208}, {"detail": [Object], "duration": 0.44399988651275635, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968394047.808}, {"detail": [Object], "duration": 1.517503023147583, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968394082.572625}, {"detail": undefined, "duration": 0.1409590244293213, "entryType": "measure", "name": "apply_https_updates", "startTime": 968393757.601416}, {"detail": undefined, "duration": 16.013083934783936, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968394016.605291}, {"detail": undefined, "duration": 1403.4333330392838, "entryType": "measure", "name": "open_report", "startTime": 968392672.306208}, {"detail": [Object], "duration": 15.28348159790039, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968394166.443416}, {"detail": [Object], "duration": 0.01708197593688965, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968394288.094958}, {"detail": undefined, "duration": 0.15479207038879395, "entryType": "measure", "name": "apply_https_updates", "startTime": 968394537.854291}, {"detail": [Object], "duration": 0.009166121482849121, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968395254.338}, {"detail": [Object], "duration": 21.432971477508545, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968396990.421416}, {"detail": [Object], "duration": 3.0215084552764893, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397078.794375}, {"detail": [Object], "duration": 0.4611673355102539, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397110.418458}, {"detail": undefined, "duration": 14.88450002670288, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968397057.165666}, {"detail": undefined, "duration": 14.88450002670288, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968397057.165666}, {"detail": undefined, "duration": 217.44841706752777, "entryType": "measure", "name": "send_message", "startTime": 968396858.680833}, {"detail": [Object], "duration": 0.01737511157989502, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397141.732083}, {"detail": [Object], "duration": 1.3695030212402344, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397200.349}, {"detail": [Object], "duration": 17.675146460533142, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397319.209125}, {"detail": [Object], "duration": 0.08849763870239258, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": [968397403.1635](tel:9683974031635)}, {"detail": undefined, "duration": 0.18762493133544922, "entryType": "measure", "name": "apply_https_updates", "startTime": 968397233.586125}, {"detail": undefined, "duration": 15.384583950042725, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968397382.722291}, {"detail": undefined, "duration": 15.384583950042725, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968397382.722291}, {"detail": [Object], "duration": 0.015044093132019043, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397442.718333}, {"detail": [Object], "duration": 0.01949894428253174, "entryType": "measure", "name": "<ReportActionsView> rendering", "startTime": 968397538.173208}, {"detail": undefined, "duration": 16.266417026519775, "entryType": "measure", "name": "get_ordered_report_ids", "startTime": 968398106.290208}, {"detail": undefined, "duration": 16.216791033744812, "entryType": "measure", "name": "apply_https_updates", "startTime": 968399287.737375}, {"detail": undefined, "duration": 0.16920804977416992, "entryType": "measure", "name": "apply_https_updates", "startTime": 968399688.226458}]

^ is a sample output we get from listing all the local measures. The format is really nice for LLMs to process, so we easily get such a summary:

Here are the top 5 longest operations from the performance logs, sorted by duration (in milliseconds):

open_report: 1,403.43 ms
sidebar_loaded: 259.77 ms
send_message: 217.45 ms
apply_https_updates: 16.22 ms (longest instance)
<ReportActionsView> rendering: 69.58 ms (longest instance)

My one problem I observed today though is that we only collect these behind an environment variable flag of CAPTURE_METRICS, so it won't have an effect in production and only produce a [] (my bad I tested this against a local override).

I thought about turning the markers on together with the BaseProfilingToolMenu available under Use profiling cc @hannojg.

@adhorodyski
Copy link
Contributor Author

Working on integrating measures into

function BaseProfilingToolMenu({isProfilingInProgress = false, showShareButton = false, pathToBeUsed, displayPath}: BaseProfilingToolMenuProps) {
the same way the Memoize is integrated so we can just turn this on/off.

Copy link

melvin-bot bot commented Dec 25, 2024

Skipping the payment summary for this issue since all the assignees are employees or vendors. If this is incorrect, please manually add the payment summary SO.

@adhorodyski
Copy link
Contributor Author

Still working on the linked draft PR, reworking the Performance module so it's available during runtime. I'm also adjusting it to fit our style guide.

@adhorodyski
Copy link
Contributor Author

Updated the PR, testing this now.

@mountiny mountiny added External Added to denote the issue can be worked on by a contributor NewFeature Something to build that is a new item. labels Dec 28, 2024
@melvin-bot melvin-bot bot changed the title [HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats [$250] [HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats Dec 28, 2024
Copy link

melvin-bot bot commented Dec 28, 2024

Job added to Upwork: https://www.upwork.com/jobs/~021873115691475801087

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Dec 28, 2024
Copy link

melvin-bot bot commented Dec 28, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @rayane-djouah (External)

@melvin-bot melvin-bot bot added the Daily KSv2 label Dec 28, 2024
@mountiny mountiny changed the title [$250] [HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats [HOLD for payment 2024-12-25] Benchmark performance bottlenecks when opening chats Dec 28, 2024
@melvin-bot melvin-bot bot removed the Weekly KSv2 label Dec 28, 2024
Copy link

melvin-bot bot commented Dec 28, 2024

Triggered auto assignment to @laurenreidexpensify (NewFeature), see https://stackoverflowteams.com/c/expensify/questions/14418#:~:text=BugZero%20process%20steps%20for%20feature%20requests for more details. Please add this Feature request to a GH project, as outlined in the SO.

@mountiny mountiny removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Dec 28, 2024
@melvin-bot melvin-bot bot added Weekly KSv2 Reviewing Has a PR in review and removed Daily KSv2 Weekly KSv2 labels Dec 28, 2024
@adhorodyski
Copy link
Contributor Author

@mountiny the PR got merged but I think we're blocked on evaluating this further given the profiling feature itself is currently broken. Everything is in place though, we're not collecting local metrics to the AppInfo. Should we close this issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Payment Auto-added when associated PR is deployed to production External Added to denote the issue can be worked on by a contributor NewFeature Something to build that is a new item. Reviewing Has a PR in review Weekly KSv2
Projects
Development

No branches or pull requests

6 participants