
Add Kubernetes monitoring play #392

Merged
merged 1 commit into main on Mar 19, 2024
Conversation

matofeder
Contributor

@matofeder matofeder commented Mar 19, 2024

This PR adds an initial version of Kubernetes monitoring play.

Testbed test

  1. Insert the play into the custom environment: /opt/configuration/environments/custom/playbook-kubernetes-monitoring.yml
  2. Execute the play:
$ osism apply -e custom kubernetes-monitoring
2024-03-19 11:30:49 | INFO     | Trying to run play kubernetes-monitoring in environment custom
2024-03-19 11:30:49 | INFO     | Task was prepared for execution.
2024-03-19 11:30:50 | INFO     | It takes a moment until the task has been started and output is visible here.

PLAY [Apply kubernetes monitoring role] ****************************************

TASK [Gathering Facts] *********************************************************
Tuesday 19 March 2024  11:30:55 +0000 (0:00:02.586)       0:00:02.586 ********* 
ok: [localhost]

TASK [Deploy kubernetes-monitoring helm chart] *********************************
Tuesday 19 March 2024  11:30:59 +0000 (0:00:04.109)       0:00:06.695 ********* 
changed: [localhost]

PLAY RECAP *********************************************************************
2024-03-19 11:32:08 | INFO     | Play has been completed. There may now be a delay until all logs have been written.
2024-03-19 11:32:08 | INFO     | Please wait and do not abort execution.
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   

Tuesday 19 March 2024  11:32:08 +0000 (0:01:08.368)       0:01:15.064 ********* 
=============================================================================== 
Deploy kubernetes-monitoring helm chart -------------------------------- 68.37s
Gathering Facts --------------------------------------------------------- 4.11s
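For context, a minimal sketch of what the custom play at /opt/configuration/environments/custom/playbook-kubernetes-monitoring.yml might look like. The play and task names are taken from the log above; the Helm module arguments and chart reference are assumptions, not taken from this PR:

```yaml
---
# Sketch only: chart_ref and module arguments are assumptions, not from this PR.
- name: Apply kubernetes monitoring role
  hosts: localhost
  tasks:
    - name: Deploy kubernetes-monitoring helm chart
      kubernetes.core.helm:
        name: kubernetes-monitoring
        chart_ref: dnation/kubernetes-monitoring-stack  # assumed chart reference
        release_namespace: kubernetes-monitoring
        create_namespace: true
```

In the actual PR this logic lives in a role applied by the play, per the play name "Apply kubernetes monitoring role".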
  3. Validate the deployment
$ kubectl -n kubernetes-monitoring get po 
NAME                                                        READY   STATUS    RESTARTS       AGE
alertmanager-kube-prometheus-alertmanager-0                 2/2     Running   0              3m50s
dnation-kubernetes-jsonnet-translator-797fbf8ddb-7t7wm      1/1     Running   1 (99s ago)    3m53s
kube-prometheus-operator-85cbd9ffcd-nhz6q                   1/1     Running   0              3m53s
kubernetes-monitoring-grafana-566d9d4f65-7g2qm              3/3     Running   4 (101s ago)   3m53s
kubernetes-monitoring-kube-state-metrics-79b6bfbc46-rjpqj   1/1     Running   0              3m53s
kubernetes-monitoring-prometheus-node-exporter-jmdsd        1/1     Running   0              3m53s
kubernetes-monitoring-prometheus-node-exporter-x8dzb        1/1     Running   0              3m53s
kubernetes-monitoring-prometheus-node-exporter-zh62s        1/1     Running   0              3m53s
kubernetes-monitoring-prometheus-node-exporter-zh64k        1/1     Running   0              3m53s
kubernetes-monitoring-thanos-query-6c979c7cc7-hg2df         1/1     Running   0              3m53s
prometheus-kube-prometheus-prometheus-0                     3/3     Running   0              3m50s
  4. Get the helm chart notes
$ helm -n kubernetes-monitoring get notes kubernetes-monitoring
NOTES:
dNation Kubernetes Monitoring Stack has been installed.
     _ _   _       _   _                __  __             _ _             _                _____ _             _
    | | \ | |     | | (_)              |  \/  |           (_) |           (_)              / ____| |           | |
  __| |  \| | __ _| |_ _  ___  _ __    | \  / | ___  _ __  _| |_ ___  _ __ _ _ __   __ _  | (___ | |_ __ _  ___| | __
 / _` | . ` |/ _` | __| |/ _ \| '_ \   | |\/| |/ _ \| '_ \| | __/ _ \| '__| | '_ \ / _` |  \___ \| __/ _` |/ __| |/ /
| (_| | |\  | (_| | |_| | (_) | | | |  | |  | | (_) | | | | | || (_) | |  | | | | | (_| |  ____) | || (_| | (__|   <
 \__,_|_| \_|\__,_|\__|_|\___/|_| |_|  |_|  |_|\___/|_| |_|_|\__\___/|_|  |_|_| |_|\__, | |_____/ \__\__,_|\___|_|\_\
                                                                                    __/ |
Visit https://www.dNation.cloud/ for detailed information.                         |___/
If you're experiencing issues please read the project documentation and FAQ.

1. Check its status by running:

    kubectl --namespace kubernetes-monitoring get pods

4. Get your 'admin' user password by running:

    kubectl --namespace kubernetes-monitoring get secret kubernetes-monitoring-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo

5. If you didn't modify the default values the Grafana server is exposed by ClusterIP service and can be accessed via port 80 on the following DNS name from within your cluster:

     kubernetes-monitoring-grafana.kubernetes-monitoring.svc.cluster.local

   Use Port Forwarding if you want to access the Grafana server from outside your cluster:

     export POD_NAME=$(kubectl get pods --namespace kubernetes-monitoring -l "app.kubernetes.io/name=grafana,app.kubernetes.io/instance=kubernetes-monitoring" -o jsonpath="{.items[0].metadata.name}")
     kubectl --namespace kubernetes-monitoring port-forward $POD_NAME 3000

6. Login with the password from step 2 and the username: 'admin'

7. Search for `Monitoring` dashboard in the `dNation` directory. The fun starts here :)
  5. Follow the instructions above to expose and access the Grafana UI.

[Screenshot: Grafana UI]

Known issues

  1. The K3s control plane components kube-scheduler, kube-proxy, and kube-controller-manager do not expose metrics endpoints by default.
    Those metrics endpoints should be enabled, see here.
    As a result, Kubernetes monitoring shows the above control plane components as follows:
    [Screenshot: control plane component status in the Monitoring dashboard]

Action item: Adjust the k3s configuration and enable the control plane components' metrics endpoints.
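One way to address this action item, sketched under the assumption that k3s is configured via /etc/rancher/k3s/config.yaml; the exact flag set is an assumption and should be verified against the k3s documentation:

```yaml
# /etc/rancher/k3s/config.yaml (sketch; flags assumed, verify against k3s docs)
kube-controller-manager-arg:
  - "bind-address=0.0.0.0"          # expose kube-controller-manager metrics beyond localhost
kube-scheduler-arg:
  - "bind-address=0.0.0.0"          # expose kube-scheduler metrics beyond localhost
kube-proxy-arg:
  - "metrics-bind-address=0.0.0.0"  # expose kube-proxy metrics beyond localhost
```

By default these components bind their metrics endpoints to localhost only, which is why Prometheus cannot scrape them from within the cluster.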

  2. The Prometheus Node Exporter may already be deployed as a separate component in a testbed; in that case it is not necessary to deploy it again via the Kubernetes monitoring stack. The existing Node Exporter deployment can simply be wired into the Kubernetes monitoring stack as follows:
kube-prometheus-stack:
  nodeExporter:
    enabled: false
  prometheus:
    prometheusSpec:
      additionalScrapeConfigs:
        - job_name: node-exporter
          metrics_path: /metrics
          static_configs:
            - labels:
                pod: "external"  # FIXME: see https://github.com/dNationCloud/kubernetes-monitoring/issues/198
              targets:
              - "testbed-manager:9100"
              - "testbed-node-0:9100"
              - "testbed-node-1:9100"
              - "testbed-node-2:9100"

Action item: Create a separate role for the Kubernetes monitoring stack deployment, with conditions that disable the stack's own Node Exporter deployment whenever the mentioned (existing) node-exporter testbed deployment is enabled, and enable it otherwise.
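This action item could be sketched as a role default plus a templated chart value; the variable name, role layout, and chart reference below are hypothetical, not from this PR:

```yaml
# roles/kubernetes_monitoring/defaults/main.yml (hypothetical variable name)
kubernetes_monitoring_deploy_node_exporter: true

# roles/kubernetes_monitoring/tasks/main.yml (sketch; chart_ref assumed)
- name: Deploy kubernetes-monitoring helm chart
  kubernetes.core.helm:
    name: kubernetes-monitoring
    chart_ref: dnation/kubernetes-monitoring-stack
    release_namespace: kubernetes-monitoring
    values:
      kube-prometheus-stack:
        nodeExporter:
          # Disabled when the testbed already runs its own node-exporter
          enabled: "{{ kubernetes_monitoring_deploy_node_exporter }}"
```

The testbed inventory would then set the variable to false whenever its own node-exporter deployment is enabled.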

@matofeder matofeder requested a review from berendt March 19, 2024 11:56
@matofeder matofeder added the SCS Sovereign Cloud Stack label Mar 19, 2024
@berendt berendt merged commit 6be91a9 into main Mar 19, 2024
2 checks passed
@berendt berendt deleted the k8s-monitoring branch March 19, 2024 16:35
matofeder added a commit that referenced this pull request Apr 30, 2024
[PR#392](#392) added
Kubernetes monitoring deployment play. Kubernetes monitoring is
intended to monitor the k3s instance control plane components out of
the box.
PR#392 observes several issues that are mentioned in the
PR's description, which are related to the k3s distribution.

These issues are addressed within this commit.

Signed-off-by: Matej Feder <[email protected]>
matofeder added a commit that referenced this pull request Apr 30, 2024
[PR#392](#392) added
Kubernetes monitoring deployment play. Kubernetes monitoring is
intended to monitor the k3s instance control plane components out of
the box.
PR#392 observes several issues that are mentioned in the
PR's description, which are related to the k3s distribution.

This commit addresses the first issue, where the monitoring solution
could not monitor the K3s control plane components kube-scheduler,
kube-proxy, and kube-controller-manager.

The second part of the fix: osism/defaults#177

Signed-off-by: Matej Feder <[email protected]>
berendt pushed a commit that referenced this pull request Apr 30, 2024
[PR#392](#392) added
Kubernetes monitoring deployment play. Kubernetes monitoring is
intended to monitor the k3s instance control plane components out of
the box.
PR#392 observes several issues that are mentioned in the
PR's description, which are related to the k3s distribution.

This commit addresses the first issue, where the monitoring solution
could not monitor the K3s control plane components kube-scheduler,
kube-proxy, and kube-controller-manager.

The second part of the fix: osism/defaults#177

Signed-off-by: Matej Feder <[email protected]>