[rush-lib] Ensure async telemetry hooks flush during error report #4666
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Hey team, I raised a question in Zulip about an issue with custom plugins using the the
flushTelemetry
hook:https://rushstack.zulipchat.com/#narrow/stream/262513-general/topic/flushTelemetry.20hook.20receives.20no.20data.20for.20failed.20operations/near/432975837
Summarizing here,
RushCommandLineParser
doesn't wait for theflushTelemetry
async tasks to finish for processes which ended in failure.Details
I created a reproduction for this issue here, with a really simple plugin that just taps the telemetry and re-writes it to a custom file. You should be able to see the following:
rush build --to rush-flush-telemetry-example-plugin
: Task succeeds, and the custom telemetry file gets written with the telemetry report.rush build --to my-app
: Task fails, and the custom file gets written but contains no data.We did a little more digging & it looks like the tap hook actually works, its just the async hooks that aren't getting processed correctly. The approach I used is moving the process exit into the finally of
ensureFlushedAsync
. I think this is probably the right behavior, but this is a change in behavior for failed operations since it'll delay the process terminating until the telemetry hooks can be safely flushed. If this isn't an acceptable change in behavior, could we consider supporting this behind a feature flag? Or if there is a better approach I'm missing very open to suggestions. 🙇How it was tested
There is a new test file I added,
RushCommandLineParserFailureCases
which is separate from the existing test file to avoid the mocking imports. This necessitated moving some of the test setup code into another file so it could be reused.Impacted documentation
None, this addresses an issue with the native telemetry.