feat: emit start signals for cron job monitoring #934
Add dogwrap.started gauge metric and info start event before execute() to enable dead man's snitch monitoring for cron jobs that never start. Start metric fires when --send_metric is set. Start event fires when --submit_mode all is set. No new CLI flags. No behavior change for existing users. Fixes DataDog#933
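The gating rules in this commit message can be sketched as a small decision function. This is an illustration only, not the code from the PR: the function name `start_signals` and the `SUBMIT_MODES` tuple are hypothetical, and the list of submit modes is an assumption (only `all` is named above).

```python
from itertools import product

# Assumed choices for --submit_mode; only "all" is named in the description.
SUBMIT_MODES = ("errors", "warnings", "all")


def start_signals(submit_mode, send_metric):
    """Return (emit_metric, emit_event) for one flag combination."""
    emit_metric = bool(send_metric)    # start metric follows --send_metric
    emit_event = submit_mode == "all"  # start event follows --submit_mode all
    return emit_metric, emit_event


if __name__ == "__main__":
    # Enumerate every combination, one row per line.
    for mode, metric_flag in product(SUBMIT_MODES, (False, True)):
        print(mode, metric_flag, start_signals(mode, metric_flag))
```

Because both conditions reuse existing flags, a user who passes neither gets `(False, False)`, which matches the "no behavior change for existing users" claim.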
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7f7f8c35bb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ute() API failures (timeout, bad key, unreachable endpoint) on start signals must not prevent the wrapped command from running. Adds two tests verifying execute() is called even when start metric or event raises. Addresses review feedback on DataDog#934
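A minimal sketch of the failure isolation this commit describes, assuming the start signals are plain callables. The names `run_with_start_signals`, `failing_sender`, and the logger name are hypothetical and do not come from the PR; the point is only that an exception raised while sending a start signal is caught and logged, and `execute()` still runs.

```python
import logging

log = logging.getLogger("dogwrap_sketch")  # hypothetical logger name


def run_with_start_signals(start_senders, execute):
    """Call each start sender, swallowing failures, then run the command."""
    for send in start_senders:
        try:
            send()
        except Exception:
            # Timeout, bad key, unreachable endpoint: log and carry on.
            log.warning("start signal failed; running command anyway",
                        exc_info=True)
    return execute()


def failing_sender():
    """Simulates an API call that cannot reach the endpoint."""
    raise ConnectionError("simulated unreachable endpoint")


calls = []
result = run_with_start_signals(
    [failing_sender, lambda: calls.append("event")],
    lambda: calls.append("execute") or 0,  # stand-in for execute(), returns 0
)
```

Each sender is guarded separately, so a failing metric does not suppress the event either. The two tests mentioned in the commit presumably check the same property with mocks, though their exact form is not shown here.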
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08fe228dbf
This issue has been automatically marked as stale because it has not had activity in the last 30 days.
What does this PR do?
Adds start signals to dogwrap so users can monitor cron jobs that were never triggered or failed to run. This enables, natively within Datadog, the kind of cron monitoring that established tools such as Dead Man's Snitch and Cronitor provide.
Fixes #933
Description of the Change
Two start signals are emitted before execute(), gated on existing flags (no new CLI surface):
- dogwrap.started gauge metric (value=1): fires when --send_metric is set
- info start event: fires when --submit_mode all is set
A minor refactor moves initialize(), host/site resolution, and tag construction before execute(), since start signals need the API client ready. These blocks don't depend on execution results.
Behavior matrix: [table not recoverable; surviving column headers: --submit_mode, --send_metric]
Alternate Designs
- … (--emit_start): Rejected. Reusing existing flags keeps the CLI surface unchanged.
Possible Drawbacks
- … execute() when start signals are enabled. Negligible latency for cron job use cases.
Verification Process
- … tests/unit/dogshell/test_wrap.py covering all behavior matrix combinations
- … tests/unit/dogwrap/test_dogwrap.py tests pass unchanged
Additional Notes
No changes to execute(), poll_proc(), OutputReader, build_event_body(), trim_text(), or parse_options().
Release Notes
dogwrap now emits a dogwrap.started metric and an info start event before command execution, enabling monitoring for cron jobs that fail to start.
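The ordering described under Description of the Change (setup first, then start signals, then the command) might look roughly like the sketch below. Everything here is an assumption for illustration: `RecordingAPI` is a stand-in that records calls instead of contacting Datadog, and the `wrap` function, the tag format, and the event title are hypothetical. Only the metric name `dogwrap.started`, the value 1, the gauge type, and the info alert type come from the description.

```python
class RecordingAPI:
    """Stand-in API client that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def send_metric(self, name, value, metric_type, tags):
        self.calls.append(("metric", name, value, metric_type))

    def create_event(self, title, alert_type, tags):
        self.calls.append(("event", title, alert_type))


def wrap(api, name, submit_mode, send_metric, execute):
    """Emit start signals (if enabled), then run the wrapped command."""
    # Setup that does not depend on the command's result happens first.
    tags = ["event_name:%s" % name]  # hypothetical tag format
    if send_metric:
        api.send_metric("dogwrap.started", 1, "gauge", tags)
    if submit_mode == "all":
        # Hypothetical title; the description only says it is an info event.
        api.create_event("[started] %s" % name, "info", tags)
    return execute()


api = RecordingAPI()
returncode = wrap(
    api, "nightly-backup", "all", True,
    lambda: api.calls.append(("execute",)) or 0,  # stand-in for execute()
)
```

After this runs, `api.calls` lists the metric, then the event, then the command. A monitor that alerts when `dogwrap.started` stops arriving relies on exactly that ordering, since the signals must be sent even if the command later hangs or crashes.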