[Tracking] Collect & visualise sustainability-related metrics #20

nikimanoledaki · 2024-01-10T12:22:24Z

This issue aims to investigate the sustainability-related metrics that could be implemented as part of our reference architecture.

The WG has so far identified the following use cases that each require a slightly different set of metrics:

SRE Metrics

Metrics used by CNCF project maintainers to make improvements at the application level. For example, as mentioned by @incertum in the issue linked before: Falco's own internal metrics (CPU, memory, and counters), traditional SRE metrics (CPU/mem usage), and energy metrics.

More information about this can be found in the Metrics section of the Green Reviews design document.

CPU usage
- Typically measured as a percentage of one CPU, it can be compared with the number of available CPUs on the host. Falco's hot path is single-threaded, so it should not be able to exceed the capacity of one full CPU.
Memory RSS
- Resident Set Size is the portion of memory held in RAM by a process.
Memory VSZ
- Virtual Memory Size is the total memory allocated to a process, including both RAM and swap space.
container_memory_working_set_bytes in Kubernetes settings
- This is almost equivalent to the cgroups container memory_used metric natively exposed in Falco metrics.
Traffic rate
- packets/second
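
To make the definitions above concrete, here is a minimal sketch of how CPU usage as a percentage of one CPU, and RSS/VSZ, could be derived from raw samples. The function names and the sample values are illustrative assumptions, not part of Falco or the Green Reviews pipeline; the only external fact relied on is that Linux reports `VmRSS` and `VmSize` in kB in `/proc/<pid>/status`.

```python
def cpu_percent(cpu_seconds_start, cpu_seconds_end, wall_seconds):
    """CPU usage as a percentage of ONE CPU over a sampling window."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * (cpu_seconds_end - cpu_seconds_start) / wall_seconds


def share_of_host(percent_of_one_cpu, host_cpus):
    """Express the same usage as a percentage of total host CPU capacity."""
    if host_cpus <= 0:
        raise ValueError("host_cpus must be positive")
    return percent_of_one_cpu / host_cpus


def parse_memory_kb(status_text):
    """Extract RSS and VSZ (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmRSS", "VmSize"):
            # Values look like "   51200 kB"; keep only the number.
            fields[key] = int(value.split()[0])
    return {"rss_kb": fields.get("VmRSS"), "vsz_kb": fields.get("VmSize")}


# Hypothetical sample, shaped like /proc/<pid>/status (values are made up).
SAMPLE_STATUS = "Name:\tfalco\nVmSize:\t  204800 kB\nVmRSS:\t   51200 kB\n"

if __name__ == "__main__":
    # 0.5 s of CPU time consumed during a 1 s window.
    one_cpu = cpu_percent(10.0, 10.5, 1.0)
    print(one_cpu, share_of_host(one_cpu, 4))
    print(parse_memory_kb(SAMPLE_STATUS))
```

A single-threaded hot path should keep `cpu_percent` at or below 100, which is why comparing against the host CPU count is useful context rather than a limit.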

Sustainability Metrics

SCI score: [Tracking] Collect & visualise SCI score of Falco #33
Impact framework

Other emerging indices that can be used to assess an application's sustainability footprint may also be considered in the future.
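
For reference, the SCI score mentioned above is defined by the Green Software Foundation as `SCI = ((E * I) + M) per R`. The sketch below only illustrates that arithmetic; the parameter names and the example numbers are assumptions chosen for readability, not measurements from Falco.

```python
def sci_score(energy_kwh, carbon_intensity_g_per_kwh, embodied_gco2eq, functional_units):
    """Software Carbon Intensity: ((E * I) + M) / R.

    E: energy consumed, in kWh
    I: carbon intensity of that energy, in gCO2eq/kWh
    M: embodied emissions attributed to the software, in gCO2eq
    R: number of functional units (e.g. requests, events, users)
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational = energy_kwh * carbon_intensity_g_per_kwh
    return (operational + embodied_gco2eq) / functional_units


if __name__ == "__main__":
    # Made-up inputs: 2 kWh at 400 gCO2eq/kWh, 200 gCO2eq embodied, 1000 units.
    print(sci_score(2.0, 400.0, 200.0, 1000))
```

Choosing the functional unit `R` is the project-specific part; for a tool like Falco it would have to be agreed on in the linked SCI issue.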

Benchmark-Specific Metrics

Metrics needed to set up the benchmark tests for each CNCF project.

[Tracking] Create Optimal Synthetic Workloads / "Kernel Event Rates" for Falco Testing falcosecurity/cncf-green-review-testing#11

These metrics are often inter-related. For example, data about energy consumption can be used in each of these scenarios.

This issue tracks ideas and discussions about which metrics the Green Reviews pipeline should collect. That said, prioritisation is key so that the WG stays on track with the milestones set in the Roadmap.

nikimanoledaki · 2024-01-30T16:01:53Z

Looking at SRE Metrics, @incertum, do you already have a Grafana dashboard for these metrics? We would need to either create Prometheus queries or access them through the Falco internal metrics.
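
As a rough idea of what the Prometheus route could look like, here is a sketch that builds PromQL for the container-level CPU and memory metrics and turns it into a URL for the Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The namespace, the pod pattern, and the base URL are placeholders, and the `namespace`/`pod` labels assume the metrics come from the kubelet's cAdvisor endpoint.

```python
from urllib.parse import urlencode


def memory_query(namespace, pod_pattern):
    """PromQL selecting the working-set memory of matching pods."""
    return (
        "container_memory_working_set_bytes"
        f'{{namespace="{namespace}", pod=~"{pod_pattern}"}}'
    )


def cpu_query(namespace, pod_pattern, window="5m"):
    """PromQL for CPU usage in cores, averaged over the given window."""
    selector = f'{{namespace="{namespace}", pod=~"{pod_pattern}"}}'
    return f"sum(rate(container_cpu_usage_seconds_total{selector}[{window}]))"


def instant_query_url(base_url, promql):
    """URL for the Prometheus HTTP API instant-query endpoint."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


if __name__ == "__main__":
    # Placeholder namespace, pod pattern, and Prometheus address.
    print(memory_query("falco", "falco-.*"))
    print(cpu_query("falco", "falco-.*"))
    print(instant_query_url("http://localhost:9090", cpu_query("falco", "falco-.*")))
```

The same query strings can be pasted into a Grafana panel, so the dashboard and any scripted collection would stay consistent.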

incertum · 2024-01-30T19:04:09Z

@nikimanoledaki Falco does not yet have a Prometheus exporter. We may have one for Falco 0.38 in May, but I need to check with the other maintainers. Meanwhile, Falco metrics are available as internal Falco rules whose output can be piped to log-rotated files (JSONL formatted).

I propose making the CNCF SRE Metrics independent of Falco and Falco's metrics: report the CPU and memory usage of project binaries through your preferred framework and create your preferred Grafana dashboards. WDYT?

nikimanoledaki · 2024-01-31T11:16:23Z

I wonder if there are any useful metrics in the default metrics of Kubernetes, for example:

from the native components
- https://kubernetes.io/docs/concepts/cluster-administration/system-metrics/
- https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml
from kube-state-metrics (ksm) - more likely to find something here: https://github.com/kubernetes/kube-state-metrics/tree/main

It would be nice to somehow surface the internal Falco metrics that way, but I'm not sure if that would be possible since those would be logs, not metrics.

What is the filesystem location where the internal Falco metrics are exported? These metrics are at the Pod level, correct?

Which Falco Metrics would you find useful or relevant for either 1) performance monitoring or 2) setting up the benchmark tests?

Looking at this, I imagine "kernel.evt_rate" is one that we would definitely need for the benchmark tests.
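
If the internal metrics end up in the JSONL-formatted files mentioned earlier, reading a single value such as "kernel.evt_rate" could be as simple as the sketch below. The record layout (one flat JSON object per line, keyed by metric name) and the sample values are assumptions for illustration; the real Falco output may nest or name fields differently.

```python
import json


def field_values(jsonl_text, field):
    """Collect the numeric values of one field from JSONL text."""
    values = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if field in record:
            values.append(float(record[field]))
    return values


def mean_field(jsonl_text, field):
    """Average of the field across all records, or None if it never appears."""
    values = field_values(jsonl_text, field)
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical records; the key comes from this thread, the layout is assumed.
SAMPLE_JSONL = (
    '{"kernel.evt_rate": 1000.0}\n'
    '{"kernel.evt_rate": 3000.0}\n'
    '{"other": 1}\n'
)

if __name__ == "__main__":
    print(mean_field(SAMPLE_JSONL, "kernel.evt_rate"))
```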

AntonioDiTuri · 2024-02-06T17:08:08Z

I created two deep-dive tickets on the steps to collect the metrics and visualise them.
I made a distinction between Kepler and Kubernetes-related metrics, which follow a more standard approach, and Falco, which needs more thought on the process. I hope that is clear; please let me know.

nikimanoledaki added board/wg-green-reviews priority/important-soon labels Jan 10, 2024

nikimanoledaki changed the title ~~[Tracking] Collect sustainability-related metrics~~ [Tracking] Identify which sustainability-related metrics to collect Jan 10, 2024

nikimanoledaki added the area/metrics label Jan 10, 2024

nikimanoledaki changed the title ~~[Tracking] Identify which sustainability-related metrics to collect~~ [Tracking] Identify metrics to collect Jan 16, 2024

nikimanoledaki changed the title ~~[Tracking] Identify metrics to collect~~ [Tracking] Collect & visualise sustainability-related metrics Jan 17, 2024

rossf7 mentioned this issue Jan 23, 2024

[Action] Make Grafana dashboards publicly accessible from cluster #31

Open

4 tasks

AntonioDiTuri mentioned this issue Jan 24, 2024

[Tracking] Gather metrics for idle Falco #34

Open

4 tasks

nikimanoledaki added this to the Measure the cloud native sustainability footprint of Falco manually milestone Jan 24, 2024

nikimanoledaki modified the milestones: [Q1 24] Measure the cloud native sustainability footprint of Falco manually, [Q2 24] Deploy, Run, Report: Automate the sustainability footprint pipeline Apr 11, 2024
