The `metrics` deployment consists of the whole configuration and stack for monitoring targets that run in the `hush-house` cluster.

It's composed of two dependent charts:

- `prometheus`; and
- `grafana`.

Dashboards are publicly accessible at https://metrics-hush-house.concourse-ci.org/dashboards.
Prometheus gives us a place to store timeseries data for the metrics exposed by services in the cluster.
It discovers these targets by looking at specific annotations on Kubernetes objects (services and pods).
To have services from your deployment scraped by the Prometheus instance managed by this deployment, include the following annotations:

- `prometheus.io/scrape`: activates scraping for the service (required if you want scraping);
- `prometheus.io/path`: override this if the metrics path is not `/metrics`; and
- `prometheus.io/port`: set this if the metrics are exposed on a different port than the service's.
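As a sketch, a `Service` annotated this way might look like the following (the name, namespace, path, and ports are hypothetical, not part of this deployment):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app              # hypothetical service name
  namespace: my-namespace   # hypothetical namespace
  annotations:
    prometheus.io/scrape: "true"  # opt the service in to scraping
    prometheus.io/path: "/stats"  # only needed when the path is not `/metrics`
    prometheus.io/port: "8080"    # only needed when it differs from the service port
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```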
To have pods scraped:

- `prometheus.io/scrape`: only pods with a value of `true` are scraped;
- `prometheus.io/path`: override this if the metrics path is not `/metrics`; and
- `prometheus.io/port`: scrape the pod on the indicated port instead of the default of `9102`.
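Pods opt in through the same annotations, placed on the pod template. For instance, in a hypothetical `Deployment` (names and ports are illustrative only):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app   # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true"  # only pods annotated with `true` are scraped
        prometheus.io/port: "8080"    # scrape this port instead of the default `9102`
    spec:
      containers:
        - name: my-app
          image: my-app:latest
```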
The Prometheus console is not publicly exposed, but it can be accessed through port-forwarding:
```sh
# Set up a local proxy that allows us to connect
# to the `metrics-prometheus-server` service by
# hitting `127.0.0.1:9090`.
kubectl port-forward \
  --namespace metrics \
  service/metrics-prometheus-server \
  9090:80
```
The deployment includes `node-exporter`, deployed as a DaemonSet so that it runs on every node that is part of the cluster (except masters and tainted nodes not explicitly configured in the `values.yaml` file).
Cluster-level metrics can be retrieved from `kube-state`, a component also deployed by the `metrics` deployment.
The Grafana instance is publicly accessible at metrics-hush-house.concourse-ci.org.
This instance does not have persistence enabled, so no state needs to be kept around: all of its dashboards and datasource configurations are provisioned as `ConfigMap` objects.

The dashboards can be found under `./dashboards`, while the templates for the `ConfigMap`s can be found under `./templates` (templated by Helm when creating a new revision of a release).
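As an illustration of how such provisioning typically looks once templated, a dashboard `ConfigMap` could resemble the sketch below; the label name follows the Grafana chart's common sidecar convention and, like the object names and dashboard content, is an assumption rather than this deployment's exact configuration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-example   # hypothetical name
  namespace: metrics
  labels:
    grafana_dashboard: "1"          # assumed label the Grafana sidecar watches for
data:
  example-dashboard.json: |
    {
      "title": "Example Dashboard",
      "panels": []
    }
```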
Given that all of the state lives under `./dashboards` and in-place updates are not allowed in the Grafana web UI, to update a dashboard, copy its JSON configuration and paste it into the corresponding file under `./dashboards`.

Once a new revision gets created, a sidecar container in the Grafana pod notices the `ConfigMap` update and refreshes the instance to match.
NOTE: There's an intermittent bug where Helm doesn't correctly calculate the patch for `ConfigMap`s. You'll have to delete the `ConfigMap` for the dashboard you want to update before running `make deploy-metrics`. If Grafana still doesn't show your changes, delete the Grafana pod and let the Kubernetes controller bring it back.