feat: emit start signals for cron job monitoring #934

Open
pythonicrahul wants to merge 3 commits into DataDog:master from pythonicrahul:feat/dogwrap-start-signals


pythonicrahul commented Apr 4, 2026

What does this PR do?

Adds start signals to dogwrap so users can detect cron jobs that were never triggered or failed to run. This enables cron monitoring natively within Datadog, similar to what established tools such as Dead Man's Snitch and Cronitor provide.

Fixes #933

Description of the Change

Two start signals are emitted before execute(), gated on existing flags (no new CLI surface):

  1. dogwrap.started gauge metric (value=1) — fires when --send_metric is set
  2. Info start event — fires when --submit_mode all is set

A minor refactor moves initialize(), host/site resolution, and tag construction before execute() since start signals need the API client ready. These blocks don't depend on execution results.
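The ordering described above can be illustrated with a minimal sketch. This is not the code from datadog/dogshell/wrap.py: `RecordingClient`, `run_wrapped`, the tag format, and the event title are hypothetical stand-ins, and the wrapped command is passed in as a plain callable so that the order of calls is easy to inspect.

```python
class RecordingClient:
    """Hypothetical stand-in for an initialized API client; it only records calls."""

    def __init__(self):
        self.calls = []

    def gauge(self, name, value, tags):
        self.calls.append(("metric", name, value, list(tags)))

    def event(self, title, alert_type, tags):
        self.calls.append(("event", title, alert_type, list(tags)))


def run_wrapped(execute, client, name, send_metric=False, submit_mode="errors"):
    # Setup that does not depend on the command's result comes first.
    tags = ["event_name:" + name]  # hypothetical tag format

    # Start signals are sent before the command runs.
    if send_metric:
        client.gauge("dogwrap.started", 1, tags)
    if submit_mode == "all":
        client.event("[Started] " + name, "info", tags)  # hypothetical title

    # The wrapped command runs only after the start signals.
    returncode = execute()
    client.calls.append(("executed", returncode))
    return returncode


client = RecordingClient()
run_wrapped(lambda: 0, client, "nightly-backup", send_metric=True, submit_mode="all")
print([call[0] for call in client.calls])  # ['metric', 'event', 'executed']
```

If the job never starts, neither the start metric nor the start event is recorded, and that absence is what a monitor can alert on.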

Behavior matrix:

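| --submit_mode | --send_metric | start metric | start event |
|---------------|---------------|--------------|-------------|
| errors        | False         | no           | no          |
| errors        | True          | yes          | no          |
| all           | False         | no           | yes         |
| all           | True          | yes          | yes         |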
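The matrix reduces to two independent conditions, which the following sketch expresses as a pure function. The name `start_signals` is hypothetical and does not appear in the PR; it only restates the gating rules from the description.

```python
def start_signals(submit_mode, send_metric):
    """Return (emit_start_metric, emit_start_event) for the given flag values."""
    emit_metric = bool(send_metric)      # metric is gated on --send_metric
    emit_event = submit_mode == "all"    # event is gated on --submit_mode all
    return emit_metric, emit_event


# Walk through the four combinations of the behavior matrix.
for mode in ("errors", "all"):
    for metric_flag in (False, True):
        print(mode, metric_flag, start_signals(mode, metric_flag))
```

Because the two conditions do not interact, each flag can be reasoned about and tested on its own.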

Alternate Designs

  • New CLI flag (e.g. --emit_start): Rejected — reusing existing flags keeps the CLI surface unchanged.
  • Single start event only: A metric is more useful for "no data" monitors, so both are provided.

Possible Drawbacks

  • One extra API call before execute() when start signals are enabled. Negligible latency for cron job use cases.

Verification Process

  • 5 new unit tests in tests/unit/dogshell/test_wrap.py covering all behavior matrix combinations
  • All 12 existing tests/unit/dogwrap/test_dogwrap.py tests pass unchanged
  • Live-tested all 4 matrix combinations against a Datadog US5 account, confirmed metrics and events arrived correctly

Additional Notes

No changes to execute(), poll_proc(), OutputReader, build_event_body(), trim_text(), or parse_options().

Release Notes

dogwrap now emits a dogwrap.started metric and an info start event before command execution, enabling monitoring for cron jobs that fail to start.

Add dogwrap.started gauge metric and info start event before execute()
to enable dead man's snitch monitoring for cron jobs that never start.

Start metric fires when --send_metric is set. Start event fires when
--submit_mode all is set. No new CLI flags. No behavior change for
existing users.

Fixes DataDog#933
@pythonicrahul pythonicrahul requested a review from a team as a code owner April 4, 2026 12:02

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f7f8c35bb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread datadog/dogshell/wrap.py Outdated
…ute()

API failures (timeout, bad key, unreachable endpoint) on start signals
must not prevent the wrapped command from running. Adds two tests
verifying execute() is called even when start metric or event raises.

Addresses review feedback on DataDog#934
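One way to get the behavior this commit describes is to wrap each start signal in its own try/except, so that a failure is logged and the command still runs. The sketch below assumes this approach; `emit_safely`, `run_with_start_signals`, and `failing_emitter` are hypothetical names, and the actual change in wrap.py may be structured differently.

```python
import logging

log = logging.getLogger("dogwrap_sketch")


def emit_safely(emit, description):
    """Call emit(); log and swallow any exception so the caller can continue."""
    try:
        emit()
        return True
    except Exception as exc:  # broad on purpose: timeouts, bad keys, network errors
        log.warning("could not send %s: %s", description, exc)
        return False


def run_with_start_signals(execute, start_metric=None, start_event=None):
    # Each signal is guarded separately, so one failure does not skip the other.
    if start_metric is not None:
        emit_safely(start_metric, "start metric")
    if start_event is not None:
        emit_safely(start_event, "start event")
    # Reached even if both emitters raised.
    return execute()


def failing_emitter():
    raise RuntimeError("simulated API timeout")
```

A test in the spirit of the two added ones would pass `failing_emitter` for either signal and assert that `execute` was still called.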
pythonicrahul (Author) commented

@codex review


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08fe228dbf

Comment thread datadog/dogshell/wrap.py Outdated
Comment thread datadog/dogshell/wrap.py

github-actions Bot commented May 5, 2026

This issue has been automatically marked as stale because it has not had activity in the last 30 days.
Note that the issue will not be automatically closed, but this notification will remind us to investigate why there's been inactivity.

@github-actions github-actions Bot added the stale Stale - Bot reminder label May 5, 2026


Development

Successfully merging this pull request may close these issues.

dogwrap: emit start signals for cron job monitoring

1 participant