Fix that pgbackrest sometimes stops operating in prod cluster (at least add alerts!) #331
Update: After restoring the database, the pgbackrest backups started working again (first new backup on June 26th). The restore procedure was: open a terminal in the stuck db pod, scp the contents to another server, launch the same version of postgres there with the pgdata directory from the scp transfer, run pg_dump against that temporary instance, clear the PVC in the prod cluster, and import the dump.

On July 25th, though, the database pod's PVC reached 100% storage usage again, causing the same issue. When I checked the pgbackrest backups at that point, the last successful one had been on July 20th.

In summary: the pgbackrest config might actually be fine, but something causes the backups to start failing at some point, and there is no alerting in place when that happens. The failure could be detected by checking the "Conditions" column of the Kubernetes Jobs in the postgres-operator namespace.
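A rough sketch of what such a check could look like, not a tested implementation for this cluster. It assumes `kubectl` is installed and configured for the prod cluster, and it relies on the standard Kubernetes Job status, where a failed Job carries a condition with `type: Failed` and `status: "True"`. The function names `failed_jobs` and `fetch_jobs` are made up for illustration.

```python
import json
import subprocess

NAMESPACE = "postgres-operator"  # namespace mentioned in the comment above


def failed_jobs(job_list: dict) -> list[str]:
    """Return names of Jobs that have a condition type=Failed with status=True."""
    failed = []
    for job in job_list.get("items", []):
        # Jobs that are still running may have no conditions at all.
        conditions = job.get("status", {}).get("conditions") or []
        if any(
            c.get("type") == "Failed" and c.get("status") == "True"
            for c in conditions
        ):
            failed.append(job["metadata"]["name"])
    return failed


def fetch_jobs(namespace: str = NAMESPACE) -> dict:
    """Read the Job list as JSON via kubectl (assumes kubectl access to the cluster)."""
    result = subprocess.run(
        ["kubectl", "get", "jobs", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Running `failed_jobs(fetch_jobs())` on a schedule and alerting when the list is non-empty would cover the "at least add alerts" part. Note that this flags every failed Job in the namespace, not only the pgbackrest backup Jobs, so filtering by name or label may be needed.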