[IPC Protocol] Add specification to configure a user_events eventpipe session #5454

Open
wants to merge 11 commits into
base: main
from

Conversation

mdh1418
Member

@mdh1418 mdh1418 commented Apr 21, 2025

In .NET 10, the EventPipe infrastructure will be leveraged to support user_events.

This PR documents the protocol for enabling a user_events-based EventPipe Session through the Diagnostics IPC protocol, where a new EventPipe Command ID CollectTracing5 will accept necessary tracepoint configuration.

Because the user_events EventPipe session is not streaming-based, the payload first encodes a uint output_format that denotes the session format (streaming vs. user_events). After that, only the session configuration options relevant to that format are encoded; these are outlined at the top of the EventPipe Commands section, under the Streaming Session, User_events Session, and Session Providers sections.

For user_events EventPipe sessions, an additional tracepoint_config is to be encoded, to map Event IDs to tracepoints.
This protocol expects the Client to have access to the user_events_data file in order to configure a user_events EventPipe session, and expects the user_events_data file descriptor to be sent via SCM_RIGHTS in the continuation stream.
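
For readers unfamiliar with descriptor passing, the sketch below shows the general SCM_RIGHTS mechanism over a Unix domain socket using Python's `socket.send_fds` and `socket.recv_fds` (Python 3.9+, Unix only). It is not the Diagnostics IPC implementation: the socket pair, the one-byte message, and the temporary file standing in for `user_events_data` are all assumptions made for illustration.

```python
import os
import socket
import tempfile


def send_fd(sock: socket.socket, fd: int) -> None:
    # SCM_RIGHTS ancillary data has to accompany at least one byte of regular data.
    socket.send_fds(sock, [b"\x00"], [fd])


def recv_fd(sock: socket.socket) -> int:
    # The receiver gets its own descriptor referring to the same open file.
    _msg, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    return fds[0]


def demo() -> bytes:
    # "client" plays the profiling tool, "runtime" plays the target process.
    client, runtime = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as stand_in:  # stand-in for user_events_data
        send_fd(client, stand_in.fileno())
        received = recv_fd(runtime)
        try:
            os.write(received, b"event")  # write through the received descriptor
            stand_in.seek(0)
            return stand_in.read()
        finally:
            os.close(received)
            client.close()
            runtime.close()
```

Because the receiver closes its descriptor independently, the sender's own handle stays valid, which matches the idea of the runtime holding access only while a session is active.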

Additionally, as user_events does not support callbacks, a new event_filter config is expected to be encoded in CollectTracing5, to act as an allow/deny list of Event IDs that will apply post keyword/level filter.
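
To make the ordering concrete, here is a minimal sketch of an event filter applied after the keyword/level check. The `ProviderConfig` fields, the `filter_enable` flag (True meaning allow list, False meaning deny list), and the simplified keyword/level rule are assumptions for illustration, not the encoding defined by the spec.

```python
from dataclasses import dataclass


@dataclass
class ProviderConfig:
    keywords: int                  # bitmask of enabled keywords
    level: int                     # most verbose level enabled
    filter_enable: bool = False    # hypothetical: True = allow list, False = deny list
    filter_ids: frozenset = frozenset()


def passes_keyword_level(cfg, event_keywords, event_level):
    # Simplified rule: an event with no keywords always matches.
    keyword_ok = event_keywords == 0 or (event_keywords & cfg.keywords) != 0
    return keyword_ok and event_level <= cfg.level


def should_write(cfg, event_id, event_keywords, event_level):
    # The Event ID list is only consulted once the keyword/level filter passes.
    if not passes_keyword_level(cfg, event_keywords, event_level):
        return False
    in_list = event_id in cfg.filter_ids
    return in_list if cfg.filter_enable else not in_list
```

With the defaults (deny list, no IDs), every event that passes the keyword/level check is written, so a session that does not care about Event ID filtering behaves as before.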

@mdh1418 mdh1418 requested a review from a team as a code owner April 21, 2025 23:53
@mdh1418 mdh1418 force-pushed the add_ipc_message_protocol_for_user_events_ep_session branch from 722afa6 to 4094fe9 Compare April 21, 2025 23:54
mdh1418 added 2 commits April 25, 2025 12:19
Update CommandSets Command IDs
Move EventPipe StopTracing to beginning
Fix sample payload serialization
Clarify Header Size and NetTrace format version
Clarify that filter_data can be 0 length to avoid confusion with
optional meaning that encoding can be skipped
@mdh1418 mdh1418 force-pushed the add_ipc_message_protocol_for_user_events_ep_session branch from c30b773 to 8b5bf46 Compare April 25, 2025 16:26
@mdh1418 mdh1418 force-pushed the add_ipc_message_protocol_for_user_events_ep_session branch from 1ba9499 to 87902f0 Compare May 1, 2025 00:56
@mdh1418 mdh1418 requested a review from brianrob May 6, 2025 14:46
@mdh1418 mdh1418 requested a review from agocke May 6, 2025 20:51
Member

@brianrob brianrob left a comment

@mdh1418 thanks for updating this doc and for your patience with me reviewing it. A couple of questions, but otherwise, looks good.


The `version` is the version of the tracepoint format, which in this case is [version 1](#tracepoint-format-v1).

The `event_id` is the ID of the event, defined by the EventSource/native manifest.
Member

We include the event id here, but I don't see anything that also includes the provider ID. To my eyes, it looks like we will support sending events from multiple providers through the same tracepoint, and if so, what's the right way to tell which provider the event belongs to? Or maybe I just missed a detail somewhere?

Member Author

That is a good point. Currently, the tracepoint configuration details are per provider, but I haven't accounted for multiple providers trying to use the same tracepoint names. So in the runtime PR's current state, it will register tracepoints per provider and there aren't guards against duplicate tracepoint names.

Do we want to have multiple providers write to the same tracepoint? Or did we solely want to allow multiple events from the same provider to be written to a common tracepoint?

From a glance, it doesn't look like an EventPipe Provider actually has an ID, it just has its name.

Member

> To my eyes, it looks like we will support sending events from multiple providers through the same tracepoint, and if so, what's the right way to tell which provider the event belongs to?

We might technically support it, but we didn't expect any tool to do it because of that ambiguity :) The intended usage is that tools will assign one or more unique tracepoints to each provider.

Member

Ok, cool. Just wanted to make sure since we'll need to know the provider on the consumer side. Do we want to include any metadata about the provider for the future?

Member

@noahfalk noahfalk May 21, 2025

The intent is that the trace initiator will decide the mapping from tracepoint name -> provider so implicitly the metadata is already there in the name. If multiple consumers want to read the data then all of them will need to agree on some naming convention. In the future we could create a more standardized representation of the provider information in the trace data, but it increases the size of every event and it wasn't yet clear who would benefit from it. We have the option to rev the format or to try embedding it in a back-compatible way in the extension block.

Member

Sounds like there will be a tight connection between the trace initiator and the parsing of the trace in order to figure out how to map events to their actual representation. Will we have some default tracepoint names or must each session define their own mapping for everything it is interested in? How will tools collect the events? I guess they need to enable what they are interested in collecting, will that be done per tracepoint name? Since we include the event id in the event, it sounds like the tracepoint name must correspond to provider in one way or another.

Member Author

> Will we have some default tracepoint names or must each session define their own mapping for everything it is interested in?

Currently we are requiring users to be explicit about what tracepoints they want registered. I'm thinking a common default tracepoint might lead to "noisy" tracepoints. e.g. One user enables a session with provider A and another user enables a session with provider B. As the events from both providers will flow to the same tracepoint, then both users will unexpectedly see events they didn't subscribe to.

> How will tools collect the events? I guess they need to enable what they are interested in collecting, will that be done per tracepoint name?

@brianrob and @beaubelgrave are implementing the collector side. If I understood correctly, the collector will register the tracepoints and then inform the runtime of what tracepoints to register and write to based on all of the configuration being passed into the IPC channel.

The runtime will check the enable bit of the tracepoint to see if there are any listeners (e.g. https://github.com/torvalds/linux/blob/master/samples/user_events/example.c), so tooling will tell the runtime what it's interested in collecting for the runtime to write accordingly. I do think the tool will need to register every tracepoint that they want the runtime to register as well as the format string for that tracepoint.

> Since we include the event id in the event, it sounds like the tracepoint name must correspond to provider in one way or another.

I think for simplicity and for best practices with the runtime, we will suggest users to associate each provider with its own tracepoint.
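
Since the runtime currently has no guard against duplicate tracepoint names, a tool could check its own configuration before sending it. A minimal sketch, assuming the tool keeps a plain mapping from provider name to the tracepoint names it plans to request (the function name and data shape are hypothetical):

```python
from collections import defaultdict


def find_shared_tracepoints(provider_tracepoints):
    """Return {tracepoint_name: [providers]} for names claimed by more than one provider.

    provider_tracepoints: dict mapping provider name -> iterable of tracepoint names.
    """
    owners = defaultdict(set)
    for provider, names in provider_tracepoints.items():
        for name in names:
            owners[name].add(provider)
    return {name: sorted(provs) for name, provs in owners.items() if len(provs) > 1}
```

An empty result means every tracepoint belongs to exactly one provider, so the event ID carried in each record is unambiguous for a consumer.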

Member

@lateralusX lateralusX May 23, 2025

> Currently we are requiring users to be explicit about what tracepoints they want registered. I'm thinking a common default tracepoint might lead to "noisy" tracepoints. e.g. One user enables a session with provider A and another user enables a session with provider B. As the events from both providers will flow to the same tracepoint, then both users will unexpectedly see events they didn't subscribe to.

I would imagine a default tracepoint name mapping would be the provider name since we carry the event id as part of the payload. In that case, provider A and provider B events will never be mixed since they will always go to different tracepoints.

Member

@lateralusX lateralusX May 23, 2025

> I do think the tool will need to register every tracepoint that they want the runtime to register as well as the format string for that tracepoint.

Will we actually be using that or let the tool figure that out based on the requested event id? If we would use the format string then you would need individual trace points for every event since they will have different payloads.

Member Author

> I would imagine a default tracepoint name mapping would be the provider name since we carry the event id as part of the payload. In that case, provider A and provider B events will never be mixed since they will always go to different tracepoints.

Right, that would make mutually exclusive defaults for the different providers. To clarify, is this default tracepoint referring to a default tracepoint for all other events from that provider to be written to in case those IDs weren't specified in tracepoint_sets?

If so, we have an optional default_tracepoint_name that can be used if users really want to have a fallback tracepoint. It was shifted from requiring a real value to being optional following this thread, to avoid creating tracepoints that wouldn't be used.
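
The fallback behaviour described here can be sketched as a small lookup. The data shape (a mapping from tracepoint name to a set of Event IDs) and the choice to return `None` when no tracepoint applies are assumptions for illustration; only the names `tracepoint_sets` and `default_tracepoint_name` come from the discussion.

```python
def resolve_tracepoint(event_id, tracepoint_sets, default_tracepoint_name=None):
    """Pick the tracepoint for an event of a single provider.

    tracepoint_sets: dict mapping tracepoint name -> set of Event IDs routed to it.
    default_tracepoint_name: optional fallback for Event IDs not listed anywhere.
    """
    for name, event_ids in tracepoint_sets.items():
        if event_id in event_ids:
            return name
    # Assumed here: with no fallback configured, the event has nowhere to go.
    return default_tracepoint_name
```

Leaving the fallback unset avoids registering a tracepoint that would never receive events.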

> Will we actually be using that or let the tool figure that out based on the requested event id?

Yes, we will be using the format string. I believe the idea is that profiling tools will use this spec, which currently defines version 1 of the tracepoint format as `<tracepoint_name> u8 version; u16 event_id; __rel_loc u8[] extension; __rel_loc u8[] payload; __rel_loc u8[] meta`, to register the tracepoints they are interested in and then send the IPC command telling the runtime to register the corresponding tracepoints; registration would error if the formats weren't the same. I'm fuzzy on the details of how tracepoint registration errors, @beaubelgrave, did I correctly convey the process?

> If we would use the format string then you would need individual trace points for every event since they will have different payloads.

I think we are using the __rel_loc to be able to write variable length event payloads. I believe we are putting the burden on profiling tools to know how to decode event payloads according to EventIDs seen earlier in the entire payload. And if users wanted to, they can already specify a single trace point for every event ID, it would just be an extremely long tracepoint_config.
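
A minimal sketch of how a consumer might decode a record laid out per the version 1 format string quoted above. It assumes packed little-endian fields and the kernel's `__rel_loc` convention (low 16 bits: offset measured from the end of the `__rel_loc` field itself; high 16 bits: length). The function names and sample bytes are hypothetical, and the records actually written by the runtime may carry additional framing not shown here.

```python
import struct

# Packed header: u8 version, u16 event_id, then three 32-bit __rel_loc values.
HEADER = struct.Struct("<BHIII")
REL_LOC_OFFSETS = (3, 7, 11)  # byte offsets of extension, payload, meta
REL_LOC_SIZE = 4


def pack_event(version, event_id, extension, payload, meta):
    blobs = (extension, payload, meta)
    rel_locs = []
    data_pos = HEADER.size  # variable-length data follows the fixed header
    for field_offset, blob in zip(REL_LOC_OFFSETS, blobs):
        rel = data_pos - (field_offset + REL_LOC_SIZE)  # relative to end of field
        rel_locs.append((len(blob) << 16) | rel)
        data_pos += len(blob)
    return HEADER.pack(version, event_id, *rel_locs) + b"".join(blobs)


def unpack_event(record):
    version, event_id, *rel_locs = HEADER.unpack_from(record, 0)
    blobs = []
    for field_offset, value in zip(REL_LOC_OFFSETS, rel_locs):
        start = field_offset + REL_LOC_SIZE + (value & 0xFFFF)
        length = value >> 16
        blobs.append(record[start:start + length])
    return (version, event_id, *blobs)
```

Because each `__rel_loc` carries its own length, one format string can describe events with different payload sizes; interpreting the payload bytes is still up to the tool, keyed on `event_id`.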

@steveisok steveisok requested a review from lateralusX May 21, 2025 23:47

The `StopTracing` command is used to stop a specific streaming session. Clients are expected to use this command to stop streaming sessions started with [`CollectStreaming`](#CollectStreaming).
The `CollectTracing5` command is an extension of the `CollectTracing4` command. It has all the capabilities of `CollectTracing4` and introduces new fields to enable a Linux-only user_events-based eventpipe session and to prescribe an allow/deny list for Event IDs. When the user_events-based eventpipe session is enabled, the file descriptor and SCM_RIGHTS of the `user_events_data` file must be sent through the optional continuation stream as [described](#passing_file_descriptor). The runtime will register tracepoints based on the provider configurations passed in, and runtime events will be written directly to the `user_events_data` file descriptor. The allow/deny list of Event IDs will apply after the keyword/level filter to determine whether or not that provider's event will be written.
Member

A couple of questions around how this type of session works when running with user events. Will no data be passed back over the stream since all data will be routed to the user event file or will events go through eventpipe as well? What happens when the session gets closed, will it unregister all user events registered by the session or will it keep the registered events for the lifetime of the process? Is it possible to run parallel sessions enabling the same user events using the same file descriptor, or can only one user event session run at a time, preventing multiple copies of the same events ending up in the trace file?

Member Author

> Will no data be passed back over the stream since all data will be routed to the user event file or will events go through eventpipe as well?

We are going to leverage eventpipe infrastructure to get the "hit runtime event -> write event" part for free, but we will diverge at write, where we will not use the buffer_manager. I think the only data passed back to the stream will be the eventpipe session ID and success/failure messages.

> What happens when the session gets closed, will it unregister all user events registered by the session or will it keep the registered events for the lifetime of the process?

Yes, good point. I need to add to the runtime implementation that it will unregister the tracepoints that were registered when the session started.

> Is it possible to run parallel sessions enabling the same user events using the same file descriptor or can only one user event session run at the same time preventing multiple copies of the same events ending up in the trace file.

From what I can tell, parallel sessions are supported, and I believe it's just the user_events_data file descriptor. I believe multiple copies of the same events can end up in the trace file if there are different sessions; under normal conditions, I think they will just be going to different tracepoints.

Member

@lateralusX lateralusX May 23, 2025

So for my understanding, using different file descriptors, one for each session, does that have any actual value or just a side effect of going over CollectTracing API? Normally the process writing user events will open its own file descriptor to use for register/write/unregister all its user events. I know that we pass in a file descriptor this way to give runtime access to a file descriptor with more privileges than what the process has, but is that a side effect of the design and not that we really need a different file descriptor per session?

Since we won't use the regular dotnet-trace tooling to enable this, and the behavior is different compared to regular trace sessions, runtime will still create a streaming thread for these sessions (could be disabled by using a different session type), it needs to disable buffer manager, some fields will be noop, like buffer manager size. Did we ever discuss handling user events through a different set of API's instead of extending CollectTracing API?

Member Author

Good call out. One problem of the previous user_events implementation in the runtime is that it would fail to enable user_events in the event that the application wasn't started with root permissions. Instead of requiring users to restart their applications to run with elevated permissions or to know how to modify access groups in order to use user_events, we felt that having a file descriptor passed to the runtime would be more forgiving. 1) app restarts are expensive/undesired for some users, 2) If users do have elevated permissions, starting a profiling tool (which they would need to do to read user_events) afterwards will allow them to still opt into user_events, and in this case, configure exactly what events from which providers get written to which tracepoints.

> different file descriptors, one for each session, does that have any actual value

No, I believe they should all be user_events_data, but the idea is that we only hold the file descriptor for as long as the session is active. Once the profiling tool stops all sessions, the runtime should no longer have access to the root protected file.

> dotnet-trace tooling to enable this

This isn't planned for .NET 10, but we are expecting to extend the functionality into the DiagnosticsClient in this repo, once I get around to it.

> runtime will still create a streaming thread for these sessions (could be disabled by using a different session type)

Yep, in the runtime draft PR dotnet/runtime#115265, when the output_format is 1 for a user_events-based eventpipe session, it will not be a streaming session type. I might have missed where the streaming thread gets created; I'll look for it to make sure it's excluded.

> it needs to disable buffer manager, some fields will be noop, like buffer manager size.

Right, we are buying into the eventpipe infrastructure, but diverging at the actual write, so instead of the synchronous_callback and the buffer_manager, we will use one specifically to write to tracepoints.

> Did we ever discuss handling user events through a different set of API's instead of extending CollectTracing API

Very briefly, I think it was just between Noah and me. I think the agreement was that this is just extending the capabilities of creating an eventpipe session, and instead of continuing to create CommandSets that will just have a couple of CommandIDs, we can afford to just extend CollectTracingN for things related to creating eventpipe sessions.

Member

> different file descriptors, one for each session, does that have any actual value

> No, I believe they should all be user_events_data

The file descriptors will point to the same file, but I expect they will be different descriptors. In terms of having value, the major value is the privilege elevation. A small additional value is that sessions may have different lifetimes and sharing a single descriptor across multiple sessions would require ref-counting it to get the lifetime right.

> Did we ever discuss handling user events through a different set of API's instead of extending CollectTracing API

In addition to what Mitchell mentioned the main benefit I see is that as we keep adding more capabilities to EventPipe I hope to avoid needing to keep multiple lineages of commands in sync. For example imagine we created CollectUserEvents command instead of CollectTracing5 now. Then in .NET 11 we add support for passing W3C TraceParent ids instead of ActivityIds. We'd have to create both CollectTracing5 and CollectUserEvents2 commands to add the same feature in two different variations of the scenario. The number of tools that issue these commands is probably countable on one hand so I think we are better off optimizing for our own implementation and maintenance costs at the expense of a little more complexity when tool authors need to understand the interface.


The `version` is the version of the tracepoint format, which in this case is [version 1](#tracepoint-format-v1).

The `event_id` is the ID of the event, defined by the EventSource/native manifest.
Member

@lateralusX lateralusX May 23, 2025

Events are also versioned, so just having the event id won't be enough to know how to parse the payload. In EventPipe and the nettrace format, each serialized event has an associated metadata id that is unique to the session stream and will appear before the event in the stream. The metadata record carries additional data about the event, like provider name and lots of metadata about the event, including version. How will this work for user events? I see that we have a metadata field, and looking at the description below, it sounds like we will include metadata for the first event of that type per session, similar to the nettrace metadata event. But since each event instance doesn't carry a unique metadata id, how will we look up metadata for potential different versions of an event?

Member

The metadata should be sent once per-event, per-EventPipe-session. (Process id,tracepoint) should be sufficient to resolve a session and (tracepoint,event-id) is sufficient to resolve an event. The combination of process id + event id + tracepoint name should then have equivalent precision as the metadata id used in the nettrace format. If we ever support multiple EventSources with the same name, same event id, but different parameter signatures in the same process it would cause problems and we'd have to address it. No plans to do that though.
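
On the consumer side, the lookup described here could be sketched as a cache keyed by (process id, tracepoint name, event id). The class and method names are hypothetical, and treating an empty `meta` field as "no metadata in this record" is an assumption about how a tool would recognize the first occurrence.

```python
class MetadataCache:
    """Remembers event metadata the first time a record carries it."""

    def __init__(self):
        self._by_event = {}

    def observe(self, pid, tracepoint, event_id, meta):
        """Store non-empty metadata and return what is known for this event, or None."""
        key = (pid, tracepoint, event_id)
        if meta:
            self._by_event[key] = meta
        return self._by_event.get(key)
```

The key only stays unambiguous if each tracepoint carries events from a single provider, since event IDs are unique only within a provider.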

Member

ok, so that means we would need the tracepoint to be different for different providers, since event id are only unique within a specific provider, mixing events from different providers into the same tracepoint will cause event id collisions. How will we handle different versions of the same event?

Member

> How will we handle different versions of the same event?

In the same process we don't support different versions of the same event. In separate processes you have the PID to distinguish and each process will emit its own metadata.

Member

> ok, so that means we would need the tracepoint to be different for different providers, since event id are only unique within a specific provider, mixing events from different providers into the same tracepoint will cause event id collisions

yep! #5454 (comment)
