Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure that the event snapshot is available when one observer defers and the other does not #1562

Merged

Conversation

tonyandrewmeyer
Copy link
Contributor

If an event has multiple observers, then if any of the (event+observer) pairs do not defer, then we need to save the snapshot so that it's available to process that (event+observer).

#1372 introduced a bug where if any of the (event+observer) pairs was skipped because an exact match was already in the deferral queue, we would skip saving the snapshot. This is ok if there's only a single observer, or if an (event+observer) that did not defer was processed before any that did defer, but not otherwise.

Fixes #1561

It's probably less of a shock to have them both, in case charms are relying on the event being last, and making the reemit aware of the current event or emit aware of what reemit did are both complex changes and it's not clear if it's really worth it. In many cases the deferred event and current event won't be the same anyway, so it won't matter.
@tonyandrewmeyer
Copy link
Contributor Author

I'm attempting to reproduce the failure (and then the fix) in pgbouncer-operator and postgresql-k8s-operator, although I've had limited success in the past trying to run the integration tests of complex charms outside of the regular repo.

@tonyandrewmeyer
Copy link
Contributor Author

postgresql-k8s-operator failure in my fork. Trying to get it to build with this branch to see the fix.

pgbouncer-operator's tox -e integration had 13 failures (25 xfailed, 5 errors) locally, but none matching the expected one - I think perhaps my local Juju version isn't the one expected. I'll try that in CI in my fork also.

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix and the tests.

Copy link
Contributor

@dimaqq dimaqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's confirm the observation counts for partially deferred events at today's standup.

@dimaqq
Copy link
Contributor

dimaqq commented Feb 5, 2025

P.S. confirming that the change fixes my reproducer.

@tonyandrewmeyer
Copy link
Contributor Author

Reproduced failure in pgbouncer. However, getting the integration tests to run with a branch is still proving difficult, so I'll instead do the release and then do a final verify after that.

@tonyandrewmeyer tonyandrewmeyer merged commit d0f9f50 into canonical:main Feb 5, 2025
32 checks passed
@tonyandrewmeyer tonyandrewmeyer deleted the fix-defer-dedupe-issues branch February 5, 2025 02:47
@tonyandrewmeyer
Copy link
Contributor Author

Reproduced failure in pgbouncer. However, getting the integration tests to run with a branch is still proving difficult, so I'll instead do the release and then do a final verify after that.

pgbouncer seems good now as does postgresql-k8s 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unhandled exception during refresh rollback of a charm
3 participants