[Bug] Large Archive cronjob fails with memory exhausted #22875
Comments
I have the same issue on a smaller scale, with 26 websites. The cronjob, running PHP 8.0 with a 512 MB memory limit, ran out of memory. I increased the cronjob's memory limit to 1024 MB, and that is still not enough. |
+1 (60 sites, 256MB, PHP83, cron job every 10 minutes, running for 1 year)
... but I get a different error message from my cron job since the 5.2.0 update:
Does anyone know how I can "stop" the running archiver? When I execute the same command via command line, there is no error message:
EDIT:
|
We can see both increased RAM and CPU usage since Matomo 5.2.0 update. |
The $ ./console help core:archive
[...]
--max-websites-to-process=MAX-WEBSITES-TO-PROCESS Maximum number of websites to process during a single execution of the archiver. Can be used to limit the process lifetime e.g. to avoid increasing memory usage.
--max-archives-to-process=MAX-ARCHIVES-TO-PROCESS Maximum number of archives to process during a single execution of the archiver. Can be used to limit the process lifetime e.g. to avoid increasing memory usage.
[...] However, I would prefer a less resource-intensive and more robust solution. |
Hi @mcguffin. Thank you for creating the issue and bringing this to our attention, that's very appreciated. We have reviewed and triaged the problem internally, and we have confirmed it is an issue. Our team will prioritise this, and we will update you on the progress here when we have an update to share. If you have any further information or questions, please feel free to add them here. |
Hi everyone, it looks like the issues with archiving might be related to the new hits metric introduced in Matomo 5.2.0. The update created invalidations to backfill this metric for all reports for the current year. These invalidations can expose a long-standing memory issue during the archiving process. Archiving tries to avoid running for overlapping periods. For example, if any day within a week hasn't been archived yet, that week won't be processed:
This "skipping invalidation" causes the memory usage of core:archive to grow until the archiving process completes or runs out of memory. Eventually, all pending invalidations should be processed and the system should return to a stable state. However, there are a couple of workarounds you can try to avoid waiting for the issue to resolve on its own.
Workaround 1: Limit the Number of Reports Processed
One approach to manage memory growth, as @mcguffin mentioned, is to limit the number of reports being processed in a single run. You can do this by using the --max-archives-to-process option, for example:
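A minimal sketch of such a batched run, using the --max-archives-to-process option from the help output quoted earlier in this thread. The function name, the MATOMO_CONSOLE variable, the batch size of 100 and the cap of 10 runs are illustrative assumptions, not Matomo settings; the loop does not detect when all invalidations are done, it simply reruns a fixed number of times and stops early if a run fails.

```shell
#!/bin/sh
# Sketch: rerun the archiver in small batches so each process stays short-lived.
# MATOMO_CONSOLE is a hypothetical variable pointing at Matomo's console script.
MATOMO_CONSOLE="${MATOMO_CONSOLE:-./console}"

archive_in_batches() {
    batch_size="${1:-100}"   # archives per run (arbitrary example value)
    max_runs="${2:-10}"      # upper bound on reruns (arbitrary example value)
    run=1
    while [ "$run" -le "$max_runs" ]; do
        echo "archiver run $run of $max_runs"
        # Stop immediately if a run fails, e.g. on memory exhaustion.
        "$MATOMO_CONSOLE" core:archive --max-archives-to-process="$batch_size" || return 1
        run=$((run + 1))
    done
}
```

Calling `archive_in_batches 50 20` from the Matomo directory would start up to 20 archiver runs of at most 50 archives each.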
You can then rerun archiving until all pending invalidations are processed. This helps reduce the number of skipped invalidations and may allow archiving to complete without running out of memory. To reduce the number of skipped invalidations, it can also help to lower the number of concurrent requests being sent by the archiver:
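As a sketch, lowering the concurrency could look like the wrapper below. The option name --concurrent-requests-per-website is not quoted anywhere in this thread, so treat it as an assumption and confirm it with ./console help core:archive on your version; the function name and MATOMO_CONSOLE variable are hypothetical.

```shell
#!/bin/sh
# Sketch: run the archiver with a single request per website at a time.
# MATOMO_CONSOLE is a hypothetical variable pointing at Matomo's console script.
MATOMO_CONSOLE="${MATOMO_CONSOLE:-./console}"

archive_one_at_a_time() {
    # Assumed option name; extra arguments are passed through unchanged.
    "$MATOMO_CONSOLE" core:archive --concurrent-requests-per-website=1 "$@"
}
```

Extra options can be appended, for example `archive_one_at_a_time --max-archives-to-process=50`, to combine both limits in one run.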
Archiving only 1 archive in parallel, instead of 3 by default, should also reduce the number of skipped invalidations. While this lowers the overall memory usage during archiving, it can lead to longer archiving times, depending on the number of segments you have configured or the number of dates being archived.
Workaround 2: Remove Pending Invalidations
Another option is to manually remove the invalidations related to the new hits metric. You can use the following SQL query to identify the pending invalidations:
> SELECT * FROM matomo_archive_invalidations WHERE report = 'Actions_hits' AND ts_started IS NULL AND status = 0;
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+
| idinvalidation | idarchive | name | idsite | date1 | date2 | period | ts_invalidated | ts_started | status | report |
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+
| 4558 | NULL | done.Actions | 1 | 2024-12-01 | 2024-12-01 | 1 | 2024-12-20 15:47:36 | NULL | 0 | Actions_hits |
| 4559 | NULL | done.Actions | 1 | 2024-12-01 | 2024-12-31 | 3 | 2024-12-20 15:47:36 | NULL | 0 | Actions_hits |
| 4560 | NULL | done.Actions | 1 | 2024-12-02 | 2024-12-02 | 1 | 2024-12-20 15:47:36 | NULL | 0 | Actions_hits |
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+
Then, you can delete the pending invalidations using the following query:
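A minimal sketch of such a delete, assuming it uses exactly the same WHERE clause as the SELECT shown above. The helper below only prints the statement so it can be reviewed before being run; the function name is hypothetical, and the table prefix defaults to matomo_ as in the query above.

```shell
#!/bin/sh
# Sketch: print a DELETE statement that mirrors the SELECT above.
# Argument 1: table prefix (defaults to "matomo_", as in the SELECT).
hits_invalidation_delete_sql() {
    prefix="${1:-matomo_}"
    cat <<SQL
DELETE FROM ${prefix}archive_invalidations
WHERE report = 'Actions_hits'
  AND ts_started IS NULL
  AND status = 0;
SQL
}
```

After checking the printed statement, and ideally after a database backup, the output could be piped into your database client.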
This will remove all pending invalidations for the hits metric, and memory usage during archiving should return to normal. Afterward, you can manually invalidate the specific date ranges you want the hits metric to be archived for:
|
Does anyone know what I can do to get rid of this problem: ?
|
@pfpro are there any archivers running? If not it might be the case that some jobs were aborted and are still considered as running. |
@sgiehl: see above |
I think this is not solved, right? Should the milestone tag be 5.2.2 instead? @ronak-innocraft |
Or is the solution https://github.com/matomo-org/matomo/pull/22892/files ? |
BTW, after the upgrade to 5.2.1 we still see high RAM and CPU usage, as we did with 5.2.0. |
Our Matomo instance has the same issue: our cronjobs suddenly stopped working because of:
Unfortunately, this does not resolve automatically unless we actually stop the cron job and wait until the running archivers are marked as stopped. Top shows no running archiver processes. Invalidations are empty. So something seems to be stacking up, with jobs blocking each other. Before 5.2.0 everything worked fine. If I execute the jobs manually on the shell, it works. |
"Glad" to hear that I am not alone ;) |
What happened?
./console core:archive --no-interaction --no-ansi --disable-scheduled-tasks
Allowed memory size of xxx bytes exhausted
since 5.2.0 Update.
What should happen?
a smoothly running cronjob that doesn't need my attention
How can this be reproduced?
php -d memory_limit=64M
Obviously some data is piling up in memory during the archive process.
Matomo version
5.2.0
PHP version
8.1.31
Server operating system
ubuntu 22.04
What browsers are you seeing the problem on?
Not applicable (e.g. an API call etc.)
Computer operating system
Relevant log output
Validations