[Investigate] Discover relevant dashboards (and visualizations within them) for a single alert for the custom threshold rule (exclusive of LLM aid) #209046

Open
Tracked by #209119 ...
dominiqueclarke opened this issue Jan 31, 2025 · 2 comments
Assignees
Labels
Team:obs-ux-management Observability Management User Experience Team

Comments

@dominiqueclarke
Contributor

dominiqueclarke commented Jan 31, 2025

When SREs are responding to alerts, we want to surface the most helpful, relevant information as quickly as possible to reduce the time to resolution.

Often, our customers have already brought useful information together as part of a dashboard.

To aid in the investigation of the root cause of an alert, we will surface links to relevant dashboards on the alert details page.

This issue relates exclusively to the process of determining what a relevant dashboard might be, and the underlying logic associating an alert with that data.

Scope

This ticket is exclusively scoped to developing the underlying API for detecting relevant dashboards, starting with a single alert for the custom threshold rule. The API should be designed so that it can later handle multiple alerts of differing rule types, but it will only be expected to work within the specified scope.
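As a rough sketch of what "designed for multiple alerts, scoped to one" could look like, the request below already takes an array of alerts but is validated down to a single custom threshold alert. Every name here (`RelatedDashboardsRequest`, `validateRequest`, the `'custom_threshold'` placeholder id) is hypothetical and used only for illustration; none of it is an existing Kibana API.

```typescript
// Hypothetical request/response shapes for a related-dashboards API.
// None of these names come from Kibana; they only illustrate the scoping idea.
interface AlertRef {
  alertId: string;
  ruleTypeId: string;
}

interface RelatedDashboardsRequest {
  // An array from the start, so multiple alerts can be supported later.
  alerts: AlertRef[];
}

interface RelatedDashboardsResponse {
  dashboards: Array<{
    dashboardId: string;
    title: string;
    relevantPanelIds: string[];
  }>;
}

// Placeholder id for the custom threshold rule type (assumption, not the real id).
const SUPPORTED_RULE_TYPES: ReadonlySet<string> = new Set(['custom_threshold']);

// Returns a list of reasons why the request is out of scope (empty when it is in scope).
function validateRequest(req: RelatedDashboardsRequest): string[] {
  const errors: string[] = [];
  if (req.alerts.length !== 1) {
    errors.push('exactly one alert is supported for now');
  }
  for (const alert of req.alerts) {
    if (!SUPPORTED_RULE_TYPES.has(alert.ruleTypeId)) {
      errors.push(`unsupported rule type: ${alert.ruleTypeId}`);
    }
  }
  return errors;
}
```

Widening the scope later would then mean relaxing `validateRequest` rather than changing the request shape.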

This ticket will not encompass any UI changes or new features.

This ticket will aim to find related dashboards exclusively by inspecting the lens visualizations within the dashboard. Other embeddable types are out of scope for this issue.

Solving the issue specifically for lens visualizations will give us a baseline to learn from, and will help us think through how we might inspect other embeddable types from dashboard panels.

Comparing alert fields to dashboard grouping keys within lens visualizations

We can start by looking at the AAD (alerts-as-data) fields within the alert document. We'll then crawl through the available dashboards, looking specifically at the lens visualizations. If a key within the AAD matches a grouping key for a lens visualization within a dashboard, we'll note that dashboard as being of interest.

Identifying relevant lens visualizations within a dashboard

As an MVP, visualizations will be marked relevant if they include a grouping key that matches a field in the AAD.
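A minimal sketch of this MVP matching, assuming the alert document is available as a flat record keyed by field name and that the grouping fields of each lens panel have already been extracted. The `LensPanel` and `Dashboard` shapes and both function names are simplified and hypothetical; the real lens saved-object state is considerably more involved.

```typescript
// Simplified, hypothetical shapes; real dashboard and lens saved objects are more complex.
interface LensPanel {
  panelId: string;
  groupingFields: string[]; // fields the visualization groups (breaks down) by
}

interface Dashboard {
  id: string;
  title: string;
  lensPanels: LensPanel[];
}

interface PanelMatch {
  panelId: string;
  matchedFields: string[];
}

interface DashboardMatch {
  dashboardId: string;
  title: string;
  relevantPanels: PanelMatch[];
}

// Assumes the alert document uses flattened keys such as 'host.name'.
type AlertDocument = Record<string, unknown>;

// A panel is relevant if at least one of its grouping fields is a field in the alert document.
function findRelevantPanels(alert: AlertDocument, dashboard: Dashboard): PanelMatch[] {
  const alertFields = new Set(Object.keys(alert));
  return dashboard.lensPanels
    .map((panel) => ({
      panelId: panel.panelId,
      matchedFields: panel.groupingFields.filter((field) => alertFields.has(field)),
    }))
    .filter((match) => match.matchedFields.length > 0);
}

// A dashboard is of interest if it contains at least one relevant panel.
function findRelevantDashboards(alert: AlertDocument, dashboards: Dashboard[]): DashboardMatch[] {
  return dashboards
    .map((dashboard) => ({
      dashboardId: dashboard.id,
      title: dashboard.title,
      relevantPanels: findRelevantPanels(alert, dashboard),
    }))
    .filter((match) => match.relevantPanels.length > 0);
}
```

Returning the matched fields per panel keeps enough information around to rank or explain the suggestions later.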

As a future enhancement, we'll attempt to calculate the relevance of a visualization based on whether it contains data for the alert's value of the matching grouping key.
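One way such a check could be approximated is with a cheap existence query for the alert's value of the matched field. The sketch below only builds an Elasticsearch search body; the function name, the default `@timestamp` time field, and the assumption that the field is keyword-mapped (so a `term` filter applies) are all illustrative. It also does not reproduce the visualization's own filters or data view, so it is a rough proxy rather than a faithful translation of the lens spec.

```typescript
// Hypothetical helper: builds a search body that answers "is there any document
// with field == value in this time range?". Assumes a keyword-mapped field.
interface ExistenceCheckParams {
  field: string;
  value: string;
  from: string; // e.g. 'now-15m' or an ISO timestamp
  to: string;
  timeField?: string; // defaults to '@timestamp' (assumption)
}

function buildExistenceQuery({ field, value, from, to, timeField = '@timestamp' }: ExistenceCheckParams) {
  return {
    size: 0, // no hits needed, only whether anything matched
    terminate_after: 1, // stop each shard after the first matching document
    query: {
      bool: {
        filter: [
          { term: { [field]: value } },
          { range: { [timeField]: { gte: from, lte: to } } },
        ],
      },
    },
  };
}
```

The index or data view to run this against would still have to come from the visualization itself.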

Risks

It's not clear how to transform the lens spec into a usable ES query in order to fully determine whether the dashboard actually contains data matching the value for the key/field. We'll start by working with the Visualize team to understand how, given a lens spec, we can determine the underlying ES query.

Open questions

Should we crawl through the dashboards only within the current space, or all spaces that the user has access to?

@dominiqueclarke dominiqueclarke added the Team:obs-ux-management Observability Management User Experience Team label Jan 31, 2025
@elasticmachine
Contributor

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

@maryam-saeidi
Member

We had a similar discussion with @maciejforcone previously. One concern was charts that match an alert to some extent but, for some reason, show an error or no data. Is my understanding correct that this relates to the risks you mentioned?
I am talking about the following cases. Does this mean that we also need to know whether the lens visualization will render the chart successfully, in addition to querying data similar to the visualization's definition?

[Images: two screenshots, one of a chart showing an error and one of a chart showing no data]

Also, an interesting idea mentioned by Maciej and our SREs was filtering down to charts that have a peak/dip at the same time as the alert. This can be useful in:

  1. Narrowing down our initial selection of the charts related to an alert.
  2. Possibly showing charts that are unrelated to the alert definition but might be related to the incident.
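To make the peak/dip idea concrete, here is a minimal sketch that flags a chart's series when any point close to the alert time deviates strongly from the series mean. The z-score test is a stand-in chosen for brevity, not a method proposed in this thread, and the `Point` shape, function name, and default threshold of 2 are assumptions.

```typescript
// Hypothetical data point of a chart's time series.
interface Point {
  timestamp: number; // epoch milliseconds
  value: number;
}

// Returns true if any point within `windowMs` of the alert time lies at least
// `threshold` standard deviations away from the mean of the whole series.
function hasPeakOrDipNear(series: Point[], alertTime: number, windowMs: number, threshold = 2): boolean {
  if (series.length < 3) return false; // too few points to judge
  const values = series.map((p) => p.value);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance);
  if (std === 0) return false; // flat series has no peak or dip
  return series.some(
    (p) =>
      Math.abs(p.timestamp - alertTime) <= windowMs &&
      Math.abs(p.value - mean) / std >= threshold
  );
}
```

Because it compares absolute deviation, the same check covers both peaks and dips; a real implementation would likely need something more robust to trends and seasonality.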

@dominiqueclarke dominiqueclarke changed the title [Investigate] Find relevant dashboards for a single alert for the custom threshold rule (exclusive of LLM aid) [Investigate] Discover relevant dashboards for a single alert for the custom threshold rule (exclusive of LLM aid) Jan 31, 2025
@dominiqueclarke dominiqueclarke self-assigned this Jan 31, 2025
@dominiqueclarke dominiqueclarke changed the title [Investigate] Discover relevant dashboards for a single alert for the custom threshold rule (exclusive of LLM aid) [Investigate] Discover relevant dashboards (and visualizations within them) for a single alert for the custom threshold rule (exclusive of LLM aid) Jan 31, 2025
Projects
None yet
Development

No branches or pull requests

3 participants