Extend CliMulti to support using symfony/process #22513

mneudert · 2024-08-18T12:34:57Z

Relevant references

With #22394 we plan to use a database based approach to not archive a segment/period more than once. This will introduce a problem with long running archiving processes and server restarts/deployments, because a forced stop of an archiving process can no longer be "rescued" by a higher period archiving process. The stopped invalidation will prevent any further archiving until it is detected as broken and retried (by default after 24 hours).

With #22487 we plan to introduce signal handling into the archiving process to prevent mentioned problems. The intent is to support "stop the current archive after receiving SIGINT" and "abort the current archive after receiving SIGTERM, reset that invalidation so it does not stall archiving".

CliMulti and OS signal handling

The CliMulti implementation received a performance improvement in #18157, using a synchronous shell_exec instead of the concurrent background process implementation, if only a single request is done.

A single shell_exec can not be intercepted with the features ext-pcntl provides, and with archives being able to take hours to complete, we can not provide our planned SIGTERM handling without changes. Having a server wait for hours to allow a clean shutdown of the archiving run would match our planned SIGINT handling, but SIGTERM should allow for an immediate shutdown. Especially when using costly cloud provider instances.

One way to solve our problem would be to revert the #18157 improvement, it would work (test implementation, CI run) for our use case.

Symfony/Process alternative

This PR provides the alternative route of integrating symfony/process to replace our custom shell_exec implementation. Symfony is using proc_open instead of shell_exec, and should have a lower footprint compared to the custom output pipe/file handling used by CliMulti.

Some noteworthy details on the PR implementation:

The VisitorGenerator tests fail because symfony/process is already integrated there. Should be taken care of if we decide to go with the approach of this PR
The UI OneClickUpdate tests have some issues (example run) with the new package and the location it is used in. It could be a quirk in the UI test setup itself, but needs further investigation to prevent problems with an actual upgrade.
The symfony package support detection in CliMulti is using a feature flag and is quite defensive. This should allow us to work around any problems in the update process, with the current implementation taking over gracefully.

Review

sgiehl

The code looks straight forward, only left a minor suggestion for improvement.
Tested running archiving locally, with and without the feature flag enabled and everything still seems to work as expected.

It's thus hard to say if using Symfony Process might introduce any performance decrease.
@tsteur Seems you implemented the improvement around sync requests back then. Can you see any possible problems with using Symfony Process here?

core/CliMulti.php

composer.json

tsteur · 2024-08-30T05:45:30Z

@tsteur Seems you implemented the improvement around sync requests back then. Can you see any possible problems with using Symfony Process here?

I'm not familiar with Symfony Process directly @sgiehl . From a performance point of view it would be just important to launch the process and wait until it's finished (not check regularly if it's finished like we do in async since this can cause a lot of CPU usage from experience). Of course also need to make sure it'll work across all sorts of systems and that we're not regressing anything there.

Does this help?

mneudert · 2024-08-30T09:01:39Z

Symfony is using proc_open, instead of shell_exec, to always send the process into the background.

So to wait for it to finish there has to be an active loop (reuses the existing logic with the slightly increasing 100ms+ sleeps), but the overhead should be far lower. Waiting will be done using proc_get_status, without reading any pid files or fetching the process list using shell_exec('awk').

How much difference there is between shell_exec and looping? We will need to check the metrics once it is running, but given the improvement we have for asynchronous climulti I would not be surprised by a net benefit overall.

Compatibility wise everything is currently wrapped behind checks for some PHP function requirements and a feature flag, so it won't be active unless we explicitly tell the system to use it.

Imo that sounds like we are good to give it a try 🤔

tsteur · 2024-08-30T10:40:29Z

Symfony is using proc_open, instead of shell_exec, to always send the process into the background.
So to wait for it to finish there has to be an active loop (reuses the existing logic with the slightly increasing 100ms+ sleeps), but the overhead should be far lower. Waiting will be done using proc_get_status, without reading any pid files or fetching the process list using shell_exec('awk').

Be good to measure indeed 👍 On product we saw a lot of overhead with an active loop. see #18157

Previously, we were executing all cli commands asynchronous. This comes with a big overhead that we need to check the list of processes constantly, need to work with pid files, write output to disk, etc. On one of our systems we noticed this can lead to an overhead of close to 40% CPU on the archiving. It's probably bit over-reported but it had definitely a big impact and it's maybe more realistically maybe 10-20%

10-20% overhead would be quite a cost.

sgiehl

As everything is hidden behind a feature flag, let's merge it, so we can give that a try on cloud.

mneudert added Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Stability For issues that make Matomo more stable and reliable to run for sys admins. Technical debt Issues the will help to reduce technical debt labels Aug 18, 2024

mneudert added this to the 5.2.0 milestone Aug 18, 2024

mneudert requested a review from a team August 18, 2024 12:34

sgiehl previously approved these changes Aug 19, 2024

View reviewed changes

core/CliMulti.php Outdated Show resolved Hide resolved

mneudert dismissed sgiehl’s stale review via 8276945 August 19, 2024 18:29

mneudert force-pushed the cli-multi-symfony branch 4 times, most recently from 0685507 to 543213c Compare August 26, 2024 15:14

mneudert requested review from tsteur and sgiehl August 28, 2024 14:07

mneudert added the Needs Review PRs that need a code review label Aug 28, 2024

mneudert mentioned this pull request Aug 28, 2024

Implements system signal handling for archiving interruption / termination #22487

Open

11 tasks

sgiehl reviewed Aug 28, 2024

View reviewed changes

composer.json Show resolved Hide resolved

mneudert mentioned this pull request Aug 30, 2024

Use Symfony/Process from core matomo-org/plugin-VisitorGenerator#151

Merged

11 tasks

sgiehl previously approved these changes Sep 2, 2024

View reviewed changes

mneudert added 8 commits September 2, 2024 16:41

Install symfony/process

9ec8bec

Define feature flag

7afe3e8

Detect support of CliMulti to use Symfony Process

7201620

Use Symfony Process if available

513f271

Handle Matomo upgrades across many versions more gracefully

33e87d9

Log process method used in CliMulti

86ede5c

Add proper wrapper around Symfony/Process dependency

8523161

Update plugins/VisitorGenerator submodule

9e49bbc

mneudert dismissed sgiehl’s stale review via 9e49bbc September 2, 2024 14:41

mneudert force-pushed the cli-multi-symfony branch from e467182 to 9e49bbc Compare September 2, 2024 14:41

sgiehl approved these changes Sep 2, 2024

View reviewed changes

sgiehl merged commit 1733ef7 into 5.x-dev Sep 2, 2024
25 checks passed

sgiehl deleted the cli-multi-symfony branch September 2, 2024 15:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend CliMulti to support using symfony/process #22513

Extend CliMulti to support using symfony/process #22513

mneudert commented Aug 18, 2024

sgiehl left a comment

tsteur commented Aug 30, 2024

mneudert commented Aug 30, 2024

tsteur commented Aug 30, 2024

sgiehl left a comment

Extend CliMulti to support using symfony/process #22513

Extend CliMulti to support using symfony/process #22513

Conversation

mneudert commented Aug 18, 2024

Relevant references

CliMulti and OS signal handling

Symfony/Process alternative

Review

sgiehl left a comment

Choose a reason for hiding this comment

tsteur commented Aug 30, 2024

mneudert commented Aug 30, 2024

tsteur commented Aug 30, 2024

sgiehl left a comment

Choose a reason for hiding this comment