Add a data retention cron #42

Open
wants to merge 7 commits into
base: master


andrew-kb
Contributor

This provides a simple, data-driven interface for purging old data from tables.

I've moved the exception log and worker job clean-up into retention jobs. For the exception log, this avoids running a potentially lengthy operation during a request: deletes on MyISAM tables grow increasingly slow the more operations are performed without an OPTIMIZE TABLE.

Added new jobs for rate_limit_hits and login_attempts because we weren't cleaning these up.
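To illustrate what "data-driven" could look like here, the following is a minimal sketch rather than the PR's actual code. Only the `table`, `column` and `min_age` keys appear in the diff; the column names, the three-month period and the `retentionThreshold()` helper are assumptions made for the example.

```php
<?php
// Hypothetical job list. The keys match those read from $job_spec in the diff;
// the column names and the P3M period are illustrative only.
$retention_jobs = [
    ['table' => 'exception_log',  'column' => 'date_generated', 'min_age' => new DateInterval('P14D')],
    ['table' => 'login_attempts', 'column' => 'date_added',     'min_age' => new DateInterval('P3M')],
];

/**
 * Work out the cut-off timestamp for one job; rows older than this
 * would be purged. (Hypothetical helper, not part of the PR.)
 */
function retentionThreshold(array $job_spec, DateTimeImmutable $now): string
{
    return $now->sub($job_spec['min_age'])->format('Y-m-d H:i:s');
}

$now = new DateTimeImmutable('2018-08-28 04:56:00');
foreach ($retention_jobs as $job_spec) {
    echo $job_spec['table'], ' => ', retentionThreshold($job_spec, $now), PHP_EOL;
}
```

Adding a new table to the clean-up then means adding one more entry to the list rather than writing another cron method.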

@andrew-kb andrew-kb self-assigned this Aug 28, 2018
@andrew-kb andrew-kb requested a review from TheJosh August 28, 2018 04:56

// Ensure they haven't provided a negative interval as this will cause the
// threshold to move forward in time.
if ($job_spec['min_age']->invert) {

Instead of complaining, could we just clear the invert flag?
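A rough sketch of that suggestion, assuming `min_age` is a `DateInterval` (the `invert` property in the diff suggests so). The helper name is hypothetical and this is not code from the PR.

```php
<?php
/**
 * Return a copy of the interval with the invert flag cleared, so that
 * subtracting it always moves the threshold back in time.
 * (Hypothetical helper, not part of the PR.)
 */
function forcePositiveInterval(DateInterval $interval): DateInterval
{
    $copy = clone $interval;
    $copy->invert = 0;
    return $copy;
}

$now = new DateTimeImmutable('2018-08-28 00:00:00');

// diff() against an earlier date yields an interval with invert = 1
$negative = $now->diff(new DateTimeImmutable('2018-08-14 00:00:00'));

$fixed = forcePositiveInterval($negative);
echo $now->sub($fixed)->format('Y-m-d'), PHP_EOL;
```

Working on a clone leaves the caller's interval untouched. The trade-off is that a misconfigured job is corrected silently instead of being reported.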

@@ -59,6 +59,20 @@
Register::cronJob('daily', 'Sprout\\Controllers\\Admin\\FileAdminController', 'cronCleanupInvalid');
Register::cronJob('daily', 'Sprout\\Controllers\\ContentSubscribeController', 'cronSendSubscriptions');
Register::cronJob('daily', 'Sprout\\Controllers\\Admin\\ActionLogAdminController', 'cronCleanup');
Register::cronJob('daily', 'Sprout\\Controllers\\RetentionCronController', 'cronRetention');

// Purge exception log entries after 14 days

These comments don't really add anything here, except maybe the fact that ISO 8601 period specs are a little opaque.

Of course, we could switch from ISO 8601 period specs to strtotime-style relative time strings (e.g. '14 days').
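For comparison, here is a small sketch of the two notations side by side. It assumes the job's minimum age ends up as a `DateInterval` either way; the variable names are illustrative.

```php
<?php
$now = new DateTimeImmutable('2018-08-28 12:00:00');

// ISO 8601 period spec: compact, but opaque to read
$iso = new DateInterval('P14D');

// Relative time string, parsed with the same rules as strtotime()
$relative = DateInterval::createFromDateString('14 days');

// Both intervals move the threshold back by the same amount
echo $now->sub($iso)->format('Y-m-d H:i:s'), PHP_EOL;
echo $now->sub($relative)->format('Y-m-d H:i:s'), PHP_EOL;
```

Since both forms produce a `DateInterval`, the rest of the retention code would not need to change if the notation were switched.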


$threshold = clone $now;
$threshold->sub($job_spec['min_age']);
$threshold = $threshold->format('Y-m-d H:i:s');

Should the comparison be made on the date portion only?
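One way to get a date-only comparison, sketched under the assumption that the threshold is still built in PHP as above, is to truncate the threshold to midnight instead of changing the query. The helper name is hypothetical.

```php
<?php
/**
 * Cut-off at the start of the day, so every run on the same calendar day
 * purges up to the same boundary regardless of when cron fires.
 * (Hypothetical helper, not part of the PR.)
 */
function dateOnlyThreshold(DateTimeImmutable $now, DateInterval $min_age): string
{
    return $now->sub($min_age)->setTime(0, 0, 0)->format('Y-m-d H:i:s');
}

echo dateOnlyThreshold(
    new DateTimeImmutable('2018-08-28 04:56:00'),
    new DateInterval('P14D')
), PHP_EOL;
```

Truncating on the PHP side keeps the SQL condition a plain column-versus-constant comparison, whereas wrapping the column in a function such as `DATE()` would generally stop MySQL from using an ordinary index on it.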

Cron::message("Purging records from '{$job_spec['table']}' ({$job_spec['column']}) updated before {$threshold}");

$conds = [
[$job_spec['column'], '<', $threshold]

If there is an index on the column, will it be hit?


$q = "DELETE FROM ~{$job_spec['table']} WHERE {$where}";
$count = Pdb::q($q, $params, 'count');


Should we run OPTIMIZE TABLE on MyISAM tables (either automatically or when set via an option flag)?
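A possible shape for the opt-in variant, as a sketch only. The `optimize` key and the `optimizeQuery()` helper are hypothetical and do not exist in the PR; the `~` prefix simply mirrors the DELETE query above.

```php
<?php
/**
 * Build the follow-up statement for a job, or null when none is wanted.
 * The 'optimize' key is a hypothetical opt-in flag, not part of the PR.
 */
function optimizeQuery(array $job_spec): ?string
{
    if (empty($job_spec['optimize'])) {
        return null;
    }

    // '~' mirrors the prefix used before the table name in the DELETE query
    return "OPTIMIZE TABLE ~{$job_spec['table']}";
}

$q = optimizeQuery(['table' => 'exception_log', 'optimize' => true]);
if ($q !== null) {
    echo $q, PHP_EOL;
}
```

The automatic variant would instead need to look up each table's storage engine, for example from the `ENGINE` column of `information_schema.TABLES`, before deciding whether to run the statement.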

2 participants