Grafana dashboard #13

pryorda · 2018-06-29T00:35:47Z

From @rverchere on June 27, 2017 21:7

Add grafana dashboard using this exporter.

Copied from original issue: rverchere/vmware_exporter#8

akurach · 2018-09-09T16:52:48Z

I think it's very personal, and based on your VC config. I created something like this

pryorda · 2018-09-09T17:07:27Z

@akurach looks good. Want to do a PR for it? I want to get mine added, but yours looks better. Did you build any alert manager rules for it?

Currently we have this:

ALERT Host_Warn_Cpu_Usage
  IF
    avg(vmware_host_cpu_usage / vmware_host_cpu_max) by (host_name, environment) * 100 >= 80
  FOR 30m
  LABELS {
    severity = "warning",
    alert_category = "vmware",
    instance = "{{ $labels.host_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "High cpu usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%",
    description = "High cpu usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%"
  }

ALERT Host_Crit_Cpu_Usage
  IF
    avg(vmware_host_cpu_usage / vmware_host_cpu_max) by (host_name, environment) * 100 >= 95
  FOR 10m
  LABELS {
    severity = "critical",
    alert_category = "vmware",
    instance = "{{ $labels.host_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "High cpu usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%",
    description = "High cpu usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%"
  }

ALERT Host_Warn_Mem_Usage
  IF
    avg(vmware_host_memory_usage / vmware_host_memory_max) by (host_name, environment) * 100 >= 80
  FOR 30m
  LABELS {
    severity = "warning",
    alert_category = "vmware",
    instance = "{{ $labels.host_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "High memory usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%",
    description = "High memory usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%. Consider rebalancing virtual machines on the cluster in vmware."
  }

ALERT Host_Crit_Mem_Usage
  IF
    avg(vmware_host_memory_usage / vmware_host_memory_max) by (host_name, environment) * 100 >= 98
  FOR 5m
  LABELS {
    severity = "critical",
    alert_category = "vmware",
    instance = "{{ $labels.host_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "High memory usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%",
    description = "High memory usage on {{ $labels.host_name }}: {{ $value | printf \"%.2f\" }}%. Rebalance virtual machines on the cluster in vmware."
  }

ALERT Predict_Disk_Space_Warn
  IF
    (avg(vmware_datastore_freespace_size) by (ds_name, environment, vcenter_host)/((avg(vmware_datastore_freespace_size offset 7d ) by (ds_name, environment, vcenter_host) - avg(vmware_datastore_freespace_size) by (ds_name, environment, vcenter_host))/7+1) >= 0) <= 1
  FOR 60m
  LABELS {
    severity = "warning",
    alert_category = "vmware",
    instance = "{{ $labels.vcenter_host }}:{{ $labels.ds_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "Disk space on vmware datastore {{ $labels.ds_name }} could run out in {{ $value | printf \"%.2f\" }} days",
    description = "Disk space on vmware datastore {{ $labels.ds_name }} could run out in {{ $value | printf \"%.2f\" }} days"
  }

ALERT Predict_Disk_Space_Crit
  IF
    (avg(vmware_datastore_freespace_size) by (ds_name, environment, vcenter_host)/((avg(vmware_datastore_freespace_size offset 6h ) by (ds_name, environment, vcenter_host) - avg(vmware_datastore_freespace_size) by (ds_name, environment, vcenter_host))/6 + 1) >= 0) <= 6
  FOR 5m
  LABELS {
    severity = "critical",
    alert_category = "vmware",
    instance = "{{ $labels.vcenter_host }}:{{ $labels.ds_name }}",
    team = "prod-services",
    run_book = "exampe.com/wiki/what-to-do"
  }
  ANNOTATIONS {
    summary = "Disk space on vmware datastore {{ $labels.ds_name }} could run out in {{ $value | printf \"%.2f\" }} hours",
    description = "Disk space on vmware datastore {{ $labels.ds_name }} could run out in {{ $value | printf \"%.2f\" }} hours"
  }
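
For readers adapting the two Predict_Disk_Space rules: each expression divides the current free space by the average free space lost per day (7d rule) or per hour (6h rule) over the lookback window, which gives a rough estimate of the time left until the datastore is full. The arithmetic can be sketched in Python as below. The function names and the sample numbers are made up for illustration and are not part of the exporter or of Prometheus.

```python
from typing import Optional


def time_until_full(free_now: float, free_before: float, periods: float) -> Optional[float]:
    """Mirror the arithmetic of the Predict_Disk_Space rule expressions.

    free_now    -- current free space (same unit as the metric)
    free_before -- free space `periods` days (or hours) ago
    periods     -- lookback length: 7 for the 7d rule, 6 for the 6h rule
    """
    # Average free space lost per period. The +1 keeps the divisor at 1
    # (instead of 0) when free space has not changed, as in the rules.
    loss_per_period = (free_before - free_now) / periods + 1
    if loss_per_period == 0:
        # PromQL would produce +Inf here, which never passes the "<=" threshold.
        return None
    remaining = free_now / loss_per_period
    # The rules keep only results >= 0; a negative value means free space grew.
    return remaining if remaining >= 0 else None


def should_alert(remaining: Optional[float], threshold: float) -> bool:
    """True when an estimate exists and is at or below the threshold
    (1 day for the warning rule, 6 hours for the critical rule)."""
    return remaining is not None and remaining <= threshold


if __name__ == "__main__":
    gib = 1024 ** 3
    # Hypothetical datastore: 80 GiB free now, 640 GiB free a week ago.
    days = time_until_full(80 * gib, 640 * gib, 7)
    print(f"{days:.2f} days left, warn={should_alert(days, 1)}")
```

Note that the estimate only compares two points in time (now and the offset), so a single large deletion or copy inside the window can swing it considerably; the FOR clause in the rules dampens that somewhat.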

akurach · 2018-09-09T17:10:06Z

I can try to make my dash more templated....

About alerts: for now I use the default Grafana alerts via Pushover and Telegram, but I want to migrate to Alertmanager in the future.

pryorda · 2018-09-09T17:13:23Z

@akurach No rush, I think we will find use in your dashboard.

pryorda · 2018-10-18T00:15:03Z

@akurach Any progress on this?

pete-leese · 2019-01-22T22:57:28Z

Has anyone got any good example dashboards to share? Cheers.

pryorda · 2019-01-23T04:37:04Z

There should be one in the grafana dir

pete-leese · 2019-01-23T13:22:47Z

Yes, I saw the ESX hosts dashboard, but I was wondering if there were any more examples, such as datastore storage, IOPS, or VMs, before I got stuck into making my own.

pryorda · 2019-01-26T04:10:34Z

At this time there is not.

akurach · 2019-06-28T13:19:53Z

for now I've got something like this

I can give examples of some of them)

icanhazbeer · 2019-07-24T20:53:04Z

Hello,
Are there any examples or templates out there using this exporter? I could not find any on the grafana website.

noesberger · 2019-09-23T09:00:39Z

> for now I've got something like this
>
> I can give examples of some of them)

Hi

Can you provide the dashboards you've built in Grafana for the Prometheus VMware Exporter? They look great.

PickingUpPieces · 2019-10-09T11:45:16Z

@akurach They look great actually!
Is it possible to contribute those?

dannyk81 · 2019-10-09T14:09:06Z

I always find that it's almost impossible to provide a one-size-fit-all dashboard that will work well for everyone, since our monitoring/observability requirements differ and we all look at slightly different things.

My recommendation here would be to suggest that users who would like to share/contribute their dashboards use Grafana's dashboard library - https://grafana.com/grafana/dashboards

This way we can have multiple variants published and maintained in an appropriately designed repository so that users can mix-and-match or use the variant that better fits their needs.

@pryorda this seems a more sensible approach to me, wdyt?

pryorda · 2019-10-09T20:05:57Z

I think having a standard template would be good. RAM/CPU/DISK. Other than that you're correct.

PickingUpPieces · 2019-10-16T14:52:42Z

@dannyk81 I feel that people are more responsible about updating their dashboards if they're in the git repository than if they're on Grafana dashboards. I'm with @pryorda on this. I would even like a more stacked version, so everyone can just delete the panels they don't need.

dannyk81 · 2019-10-16T15:12:51Z

well, I see at least 3 versions above and my own dashboards are quite different... even displaying RAM/CPU/DISK can be done in so many ways 😄

I would instead create a list of sample queries and alerts which focus on various aspects of the system's health; users can then use these samples to build their alerting rules and dashboards.

again, this is my preference and 2 cents on this subject...

PickingUpPieces · 2019-10-18T15:16:39Z

Sounds alright to me :) But I still think that some finished example boards which work out of the box (more or less) would be pretty helpful for newbies

medeirosjrm · 2021-09-08T21:24:39Z

pryorda, I found the alerts in the example above, posted in response to @akurach, very good. I have great difficulty creating such alerts for disk space and for virtual machines being down. Would you have any examples?

Thank you very much in advance

pryorda · 2022-02-14T18:30:46Z

I would start by understanding the metrics you are wanting to grab and then possibly look at the predictive alerts in prometheus. Let me know if this doesn't help
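
As background on the predictive alerts mentioned here: Prometheus's predict_linear() function fits a straight line through the samples of a range vector using simple linear regression and extrapolates it a given number of seconds ahead. A minimal Python sketch of that idea follows. The helper name and the readings are hypothetical, the sketch extrapolates from the last sample rather than from the rule evaluation time, and it is not Prometheus's own implementation.

```python
from typing import Sequence, Tuple


def predict_linear_sketch(samples: Sequence[Tuple[float, float]], seconds_ahead: float) -> float:
    """Fit a least-squares line through (timestamp, value) samples and
    return its value `seconds_ahead` seconds after the last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    # Spread of the timestamps; zero means all samples share one timestamp.
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must have distinct timestamps")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return intercept + slope * (last_t + seconds_ahead)


if __name__ == "__main__":
    hour = 3600.0
    # Hypothetical free-space readings in GiB, one per hour, falling 10 GiB/hour.
    readings = [(i * hour, 100.0 - 10.0 * i) for i in range(6)]
    predicted = predict_linear_sketch(readings, 4 * hour)
    print(f"predicted free space in 4h: {predicted:.1f} GiB")
    print("alert" if predicted < 20 else "ok")
```

In PromQL the same idea would be written along the lines of predict_linear(vmware_datastore_freespace_size[6h], 4 * 3600) < 0, with the window and horizon chosen to suit the datastore's growth pattern.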

pryorda added the hacktoberfest and help wanted (Extra attention is needed) labels Jun 29, 2018

pryorda mentioned this issue Jun 29, 2018

Grafana dashboard rverchere/vmware_exporter#8

Open

pryorda added the enhancement (New feature or request) label Jun 29, 2018

pryorda closed this as completed Feb 14, 2022
