Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Large Archive cronjob fails with memory exhausted #22875

Open
4 tasks done
mcguffin opened this issue Dec 17, 2024 · 14 comments
Open
4 tasks done

[Bug] Large Archive cronjob fails with memory exhausted #22875

mcguffin opened this issue Dec 17, 2024 · 14 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. triaged
Milestone

Comments

@mcguffin
Copy link

What happened?

  • Large Matomo on premise with 700+ sites.
  • Cronjob every 3 hours for ./console core:archive --no-interaction --no-ansi --disable-scheduled-tasks
  • Fails with Allowed memory size of xxx bytes exhausted since 5.2.0 Update.

What should happen?

a smoothly running cronjob that doesn't need my attention

How can this be reproduced?

  • A lot of visits to process. Sadly I don't know how to achieve this under lab conditions.
  • Simulate low memory with php -d memory_limit=64M.
  • With 64M memory was exhausted after roughly 700 archives.
$ php -d memory_limit=64M ./console core:archive --no-interaction --no-ansi --disable-scheduled-tasks

Obviously some data is piling up in memory during the archive process.

Matomo version

5.2.0

PHP version

8.1.31

Server operating system

ubuntu 22.04

What browsers are you seeing the problem on?

Not applicable (e.g. an API call etc.)

Computer operating system

Relevant log output

$ php -d memory_limit=256M <REDACTED>/console --matomo-domain=<REDACTED> core:archive --no-interaction --no-ansi --disable-scheduled-tasks
INFO      [2024-12-16 21:02:51] 3254813  ---------------------------
INFO      [2024-12-16 21:02:51] 3254813  INIT
INFO      [2024-12-16 21:02:51] 3254813  Running Matomo 5.2.0 as Super User
INFO      [2024-12-16 21:02:51] 3254813  ---------------------------
INFO      [2024-12-16 21:02:51] 3254813  NOTES
INFO      [2024-12-16 21:02:51] 3254813  - Async process archiving supported, using CliMulti.
INFO      [2024-12-16 21:02:51] 3254813  - Reports for today will be processed at most every 1800 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO      [2024-12-16 21:02:51] 3254813  - Archiving was last executed without error 7 hours 43 min ago.
INFO      [2024-12-16 21:02:51] 3254813  - Continuing ongoing archiving run by pulling from shared idSite queue.
INFO      [2024-12-16 21:02:51] 3254813  ---------------------------
INFO      [2024-12-16 21:02:51] 3254813  START
INFO      [2024-12-16 21:02:51] 3254813  Starting Matomo reports archiving...
INFO      [2024-12-16 21:02:51] 3254813  1 out of 3 archivers running currently
INFO      [2024-12-16 21:02:51] 3254813  Start processing archives for site 2723.
INFO      [2024-12-16 21:02:51] 3254813    Will invalidate archived reports for today in site ID = 2723's timezone (2024-12-16 00:00:00).
INFO      [2024-12-16 21:02:51] 3254813    Will invalidate archived reports for yesterday in site ID = 2723's timezone (2024-12-15 00:00:00).
INFO      [2024-12-16 21:02:53] 3254813  Archived website id 2723, period = day, date = 2024-12-16, segment = '', 155 visits found. Time elapsed: 1.047s
[...]
INFO      [2024-12-16 21:10:04] 3254813  Archived website id 2835, period = day, date = 2024-11-27, segment = '', plugin = Actions, report = Actions_hits, 4 visits found. Time elapsed: 0.267s
INFO      [2024-12-16 21:10:04] 3254813  Archived website id 2835, period = day, date = 2024-11-26, segment = '', plugin = Actions, report = Actions_hits, 1 visits found. Time elapsed: 0.253s
INFO      [2024-12-16 21:10:04] 3254813  Archived website id 2835, period = day, date = 2024-11-25, segment = '', plugin = Actions, report = Actions_hits, 1 visits found. Time elapsed: 0.253s
INFO      [2024-12-16 21:10:04] 3254813  Archived website id 2835, period = day, date = 2024-11-24, segment = '', plugin = Actions, report = Actions_hits, 3 visits found. Time elapsed: 0.253s
PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 28672 bytes) in <REDACTED>/core/DataAccess/Model.php on line 830
PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 32768 bytes) in <REDACTED>/core/DataTable/Manager.php on line 1

///////// different approach /////////

$ php -d memory_limit=64M <REDACTED>/console core:archive --no-interaction --no-ansi --disable-scheduled-tasks
INFO      [2024-12-17 11:13:42] 4178648  ---------------------------
INFO      [2024-12-17 11:13:42] 4178648  INIT
INFO      [2024-12-17 11:13:42] 4178648  Running Matomo 5.2.0 as Super User
[...]
INFO      [2024-12-17 11:21:20] 4178648  Archived website id 2972, period = day, date = 2024-12-17, segment = '', 1 visits found. Time elapsed: 0.287s
INFO      [2024-12-17 11:21:20] 4178648  Archived website id 2972, period = week, date = 2024-12-16, segment = '', 3 visits found. Time elapsed: 0.335s
PHP Fatal error:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 40960 bytes) in <REDACTED>/core/CronArchive/QueueConsumer.php on line 368
ERROR     [2024-12-17 11:21:20] 4178648  Fatal error encountered: <REDACTED>/core/CronArchive/QueueConsumer.php(368): Allowed memory size of 67108864 bytes exhausted (tried to allocate 40960 bytes) [Query: , CLI mode: 1]
PHP Fatal error:  Allowed memory size of 67108864 bytes exhausted (tried to allocate 32768 bytes) in <REDACTED>/core/Cookie.php on line 419

Validations

@mcguffin mcguffin added Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. To Triage An issue awaiting triage by a Matomo core team member labels Dec 17, 2024
@freinbichler
Copy link

I have the same issue on a smaller scale with 26 websites. Cronjob with PHP 8.0 with memory limit 512 MB was exhausted. Increased the memory limit for the cronjob to 1024 MB, still not enough.
Now with 2048 MB it worked. Before updating to Matomo version 5.2 it never failed even with just 512 MB.

@pfpro
Copy link

pfpro commented Dec 17, 2024

+1 (60 sites, 256MB, PHP83, cron job every 10 minutes, running for 1 year)

/usr/bin/php83 /usr/www/users/pfpro/intern/matomo/console core:archive --url=https://matomo.xxx.de/

... but i get different error message from my cron job since 5.2.0 update:

INFO      [2024-12-17 15:40:01] 70293  INIT
INFO      [2024-12-17 15:40:01] 70293  Running Matomo 5.2.0 as Super User
INFO      [2024-12-17 15:40:01] 70293  ---------------------------
INFO      [2024-12-17 15:40:01] 70293  NOTES
INFO      [2024-12-17 15:40:01] 70293  - Async process archiving supported, using CliMulti.
INFO      [2024-12-17 15:40:01] 70293  - Reports for today will be processed at most every 900 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO      [2024-12-17 15:40:01] 70293  - Archiving was last executed without error 4 min 59s ago.
INFO      [2024-12-17 15:40:01] 70293  - Continuing ongoing archiving run by pulling from shared idSite queue.
INFO      [2024-12-17 15:40:01] 70293  ---------------------------
INFO      [2024-12-17 15:40:01] 70293  START
INFO      [2024-12-17 15:40:01] 70293  Starting Matomo reports archiving...
INFO      [2024-12-17 15:40:01] 70293  Archiving will stop now because 4 archivers are already running and max 3 are supposed to run at once.
INFO      [2024-12-17 15:40:01] 70293  Reached maximum concurrent archivers allowed (3), aborting run.
INFO      [2024-12-17 15:40:01] 70293  ---------------------------
INFO      [2024-12-17 15:40:01] 70293  SCHEDULED TASKS
INFO      [2024-12-17 15:40:01] 70293  Starting Scheduled tasks... 
INFO      [2024-12-17 15:40:02] 70293  done

Does anyone know how I can "stop" the running archiver?

When I execute the same command via command line, there is no error message:

INFO      [2024-12-17 16:07:20] 72603  INIT
INFO      [2024-12-17 16:07:20] 72603  Running Matomo 5.2.0 as Super User
INFO      [2024-12-17 16:07:20] 72603  ---------------------------
INFO      [2024-12-17 16:07:20] 72603  NOTES
INFO      [2024-12-17 16:07:20] 72603  - Async process archiving supported, using CliMulti.
INFO      [2024-12-17 16:07:20] 72603  - Reports for today will be processed at most every 900 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO      [2024-12-17 16:07:20] 72603  - Archiving was last executed without error 2 min 18s ago.
INFO      [2024-12-17 16:07:20] 72603  - Continuing ongoing archiving run by pulling from shared idSite queue.
INFO      [2024-12-17 16:07:20] 72603  ---------------------------
INFO      [2024-12-17 16:07:20] 72603  START
INFO      [2024-12-17 16:07:20] 72603  Starting Matomo reports archiving...
INFO      [2024-12-17 16:07:20] 72603  1 out of 3 archivers running currently
INFO      [2024-12-17 16:07:20] 72603  Start processing archives for site 38.
INFO      [2024-12-17 16:07:20] 72603    Will invalidate archived reports for today in site ID = 38's timezone (2024-12-17 00:00:00).
INFO      [2024-12-17 16:07:20] 72603    Will invalidate archived reports for yesterday in site ID = 38's timezone (2024-12-16 00:00:00).
INFO      [2024-12-17 16:07:21] 72603  Archived website id 38, period = day, date = 2024-12-17, segment = '', 15 visits found. Time elapsed: 0.827s

EDIT:
When i updated to 5.2.0 i got this error in the first place:

We’re sorry, but the action you were trying to perform failed due to a security check. This is a standard measure to ensure the safety and security of your account. Please try again, and if the issue persists, please contact our support team for further assistance.

@mikkeschiren
Copy link
Contributor

We can see both increased RAM and CPU usage since Matomo 5.2.0 update.

@mcguffin
Copy link
Author

The --max-<SOMETHING>--to-process option should be able to suppress the crashes:

$ ./console help core:archive
[...]
      --max-websites-to-process=MAX-WEBSITES-TO-PROCESS                    Maximum number of websites to process during a single execution of the archiver. Can be used to limit the process lifetime e.g. to avoid increasing memory usage.
      --max-archives-to-process=MAX-ARCHIVES-TO-PROCESS                    Maximum number of archives to process during a single execution of the archiver. Can be used to limit the process lifetime e.g. to avoid increasing memory usage.
[...]

However, I would prefer a less resource intense and more robust solution.

@randy-innocraft
Copy link

Hi @mcguffin. Thank you for creating the issue and bringing this to our attention, that's very appreciated. We have reviewed and triaged the problem internally, and we have confirmed it is an issue. Our team will prioritise this, and we will update you on the progress here when we have an update to share. If you have any further information or questions, please feel free to add them here.

@randy-innocraft randy-innocraft added triaged Bug For errors / faults / flaws / inconsistencies etc. and removed To Triage An issue awaiting triage by a Matomo core team member Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Dec 17, 2024
@mneudert
Copy link
Member

Hi everyone,

It looks like the issues with archiving might be related to the new hits metric introduced in Matomo 5.2.0. The update created invalidations to backfill this metric for all reports for the current year. These invalidations can expose a long-standing memory issue during the archiving process.

Archiving is trying to prevent running for overlapping periods. For example, if any day within a week hasn’t been archived yet, that week won't be processed:

DEBUG [2024-12-20 15:43:41] 1834 Found archive with intersecting period with others in concurrent batch, skipping until next batch: [idinvalidation = 4153, idsite = 3, period = week(2024-12-16 - 2024-12-22), name = done, segment = ]

This "skipping invalidation" causes the memory usage of core:archive to grow until the archiving process completes or runs out of memory.

Eventually, all pending invalidations should be processed and the system should return to a stable state. However, there are a couple of workarounds you can try to avoid waiting for the issue to resolve on its own.

Workaround 1: Limit the Number of Reports Processed

One approach to manage memory growth, as @mcguffin mentioned, is to limit the number of reports being processed in a single run. You can do this by using the --max-archives-to-process option, for example:

./console core:archive --max-archives-to-process=500

You can then rerun archiving until all pending invalidations are processed. This helps reduce the number of skipped invalidations and may allow archiving to complete without running out of memory.

To reduce the number of skipped invalidations t can also help to lower the amount of concurrent requests being sent by the archiver:

./console core:archive --concurrent-requests-per-website=1

Archiving only 1 archive in parallel, instead of 3 by default, should also reduce the amount of skipped invalidations. While lowering the overall memory usage during archiving, this can lead to longer archiving times, depending on the numbers of segments you have configured or the number of dates being archived.

Workaround 2: Remove Pending Invalidations

Another option is to manually remove the invalidations related to the new hits metric. You can use the following SQL query to identify the pending invalidations:

> SELECT * FROM matomo_archive_invalidations WHERE report = 'Actions_hits' AND ts_started IS NULL AND status = 0;
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+
| idinvalidation | idarchive | name         | idsite | date1      | date2      | period | ts_invalidated      | ts_started | status | report       |
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+
|           4558 |      NULL | done.Actions |      1 | 2024-12-01 | 2024-12-01 |      1 | 2024-12-20 15:47:36 | NULL       |      0 | Actions_hits |
|           4559 |      NULL | done.Actions |      1 | 2024-12-01 | 2024-12-31 |      3 | 2024-12-20 15:47:36 | NULL       |      0 | Actions_hits |
|           4560 |      NULL | done.Actions |      1 | 2024-12-02 | 2024-12-02 |      1 | 2024-12-20 15:47:36 | NULL       |      0 | Actions_hits |
+----------------+-----------+--------------+--------+------------+------------+--------+---------------------+------------+--------+--------------+

Then, you can delete the pending invalidations using the following query:

> DELETE FROM matomo_archive_invalidations WHERE report = 'Actions_hits' AND ts_started IS NULL AND status = 0;

This will remove all pending invalidations for the hits metric, and memory usage during archiving should return to normal. Afterward, you can manually invalidate the specific date ranges you want the hits metric to be archived for:

# To invalidate all reports:
./console core:invalidate-report-data --dates=2024-11-01,2024-11-30

# To invalidate just the 'hits' metric:
./console core:invalidate-report-data --dates=2024-11-01,2024-11-30 --plugin=Actions.Actions_hits

@pfpro
Copy link

pfpro commented Dec 21, 2024

Does anyone know what I can do to get rid of this problem: ?

Archiving will stop now because 4 archivers are already running and max 3 are supposed to run at once.

@sgiehl
Copy link
Member

sgiehl commented Dec 21, 2024

@pfpro are there any archivers running? If not it might be the case that some jobs were aborted and are still considered as running.
If no jobs are running you could check the archive_invalidations table if any invalidations are marked as running (status=1). You can reset them to 0 and start archiving again.
Otherwise the problem might solve itself after a day, as then running invalidations will no longer be considered as such then.

@pfpro
Copy link

pfpro commented Dec 21, 2024

@sgiehl: see above
#22875 (comment)

@ronak-innocraft ronak-innocraft added this to the 5.2.1 milestone Dec 22, 2024
@mikkeschiren
Copy link
Contributor

I think this is not solved, right? Should the milstone tag instead be 5.2.2? @ronak-innocraft

@mikkeschiren
Copy link
Contributor

Or is the solution https://github.com/matomo-org/matomo/pull/22892/files ?

@mikkeschiren
Copy link
Contributor

BTW - after upgrade to 5.2.1 we are still see high RAM and CPU usage as we did with 5.2.0.

@boldtrn
Copy link

boldtrn commented Dec 30, 2024

Our Matomo instance is having the same issue that suddenly our cronjobs don't work anymore because of:

Archiving will stop now because 4 archivers are already running and max 3 are supposed to run at once.

Unfortunately, this does not resolve automatically unless we are actually stopping the cron job to wait until the running archivers are marked as stopped. Top shows no running archiver processes. Invalidations are empty.

So something seems to be stacking up and blocking each other.

Before 5.2.0 everything worked fine.

If I execute the jobs manually on the shell it works.

@pfpro
Copy link

pfpro commented Dec 30, 2024

"Glad" to hear that I am not alone ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. triaged
Projects
None yet
Development

No branches or pull requests

9 participants